(54) Title: FLUORESCENT PROTEINS

(57) Abstract: There is disclosed an isolated nucleic acid molecule encoding a new fluorescent protein which is capable of emitting
fluorescence upon irradiation by incident light, wherein the maximal absorbance of incident light is in the range of 440-480 nm
and maximal fluorescence emission is in the range of 470-510 nm.

FLUORESCENT PROTEINS

The present invention is concerned with fluorescent
proteins and, in particular, with nucleic acid
sequences encoding novel fluorescing proteins which
have been isolated from coral species.

Fluorescent proteins, such as green fluorescent
protein from the luminescent jellyfish Aequorea
victoria, are extremely useful molecules by virtue of
their ability to function as markers for gene
expression and protein localisation within living
cells. Fluorescent proteins can be produced in vivo
by biological systems and can therefore be used to
monitor and trace the progress of intracellular
events.

In the present invention, the inventors have
surprisingly identified completely novel fluorescing
proteins from the coral species Anthozoa which have
been sequenced and which can be used for in vivo
labelling studies.

Therefore, according to a first aspect of the
invention there is provided an isolated nucleic acid
molecule encoding a fluorescent protein comprising an
amino acid sequence illustrated in any of the
polypeptide sequences of figures 3(a) to 3(d). The
present inventors have advantageously identified 4
distinct nucleic acid molecules encoding fluorescing
proteins which heretofore have not yet been described.
In a further aspect, the invention comprises an
isolated nucleic acid molecule encoding a protein
capable of emitting fluorescence upon irradiation by
incident light, wherein said maximal absorbance of
said incident light is in the range 440-480 nm, in
particular 450-475 nm (maximum of excitation) and maximal
fluorescence emission is in the range 470-510 nm, in
particular 480-500 nm (maximum of emission).

According to the invention, at least 4 different
fluorescent proteins (and nucleic acid sequences
encoding said proteins) were obtained from species of
coral, and in particular from species of coral
belonging to the genus Discosoma and the genus
Polythoa.

In addition, as can be seen from the data given
hereinbelow, hybrids of fluorescent proteins derived
from two or more different species from the genus
Polythoa and/or Discosoma may also be used. Such
hybrid fluorescent proteins of the invention may be
obtained by suitable expression of hybrid
(e.g. chimeric) nucleic acid sequences encoding such
hybrid proteins, which in turn may for instance be
obtained by suitably combining (two or more parts of)
two or more naturally occurring nucleic acids (i.e.
cDNAs and/or genes) encoding (native) fluorescent
proteins, at least one of which has been obtained from
a coral of the species Polythoa and/or Discosoma
(and/or from another coral). This can be carried out
by techniques known per se and/or as further described
below, including but not limited to "gene shuffling"
techniques.

A listing of the clones used in the invention is given
in Figure 2. Also, an alignment of some of the clones
used herein is given in Figure 8B.



The excitation and emission spectra for some of these
proteins are given in the Figures, and are also
summarized below:

Clone   Source          Mutations      Excitation max (nm)   Emission max (nm)

pGR7    Polythoa spec.  Q135R          469 (452)             490
pGR3    Polythoa spec.  N41D, 3' end   469 (452, 489)        496
pGR13   Polythoa spec.  none           469 (452)             490
pGR15   Hybrid          none           451 (440)             484
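As a hedged illustration, the tabulated maxima can be checked against the wavelength windows stated earlier (maximal absorbance 440-480 nm, in particular 450-475 nm; maximal emission 470-510 nm, in particular 480-500 nm). The Python sketch below is not part of the disclosure: the names `CLONES`, `in_window` and `classify` are assumptions made for illustration, and the pGR15 excitation maximum of 451 nm is taken from the corresponding embodiment described in the text.

```python
# Illustrative sketch only; names and layout are assumptions.
# Wavelength windows (nm) as stated in the description.
BROAD = {"excitation": (440, 480), "emission": (470, 510)}
PREFERRED = {"excitation": (450, 475), "emission": (480, 500)}

# (excitation maximum, emission maximum) in nm, as summarised in the table.
CLONES = {
    "pGR7": (469, 490),
    "pGR3": (469, 496),
    "pGR13": (469, 490),
    "pGR15": (451, 484),
}


def in_window(value, window):
    """True if value lies inside the inclusive (low, high) window."""
    low, high = window
    return low <= value <= high


def classify(excitation_max, emission_max):
    """Return 'preferred', 'broad' or 'outside' for a pair of maxima in nm."""
    for label, ranges in (("preferred", PREFERRED), ("broad", BROAD)):
        if in_window(excitation_max, ranges["excitation"]) and in_window(
            emission_max, ranges["emission"]
        ):
            return label
    return "outside"


if __name__ == "__main__":
    for clone, (ex, em) in CLONES.items():
        print(clone, classify(ex, em))
```

The narrower window is tested first, so a protein is reported as "broad" only when it misses the preferred ranges but still satisfies the wider ones.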



Accordingly, in one embodiment, the invention relates
to a fluorescent protein with an emission spectrum
which has:

- a maximum of emission (fluorescence - measured
following excitation at 469 nm) at between 491 and 501
nm, and in particular at about 496 nm;

and preferably one, and more preferably both, of the
following:

- an emission at 480 nm (fluorescence - measured
following excitation at 469 nm) of between 30 and 40 %
of the emission at the maximum of emission;

- an emission at 525 nm (fluorescence - measured
following excitation at 469 nm) of between 35 and 45 %
of the emission at the maximum of emission;

and with an excitation spectrum which has:

- a maximum of absorbance (measured at emission at
490 nm) at between 464 and 474 nm, and in particular
at about 469 nm; and at least any one, preferably at
least any two, more preferably at least any three, and
most preferably all four of the following:

- an absorbance at 452 nm (measured at emission at
490 nm) of between 59 and 69 % of the absorbance at
the maximum of absorbance;

- an absorbance at 456 nm (measured at emission at
490 nm) of between 54 and 64 % of the absorbance at
the maximum of absorbance;

- an absorbance at 486 nm (measured at emission at
490 nm) of between 42 and 52 % of the absorbance at
the maximum of absorbance;

- an absorbance at 489 nm (measured at emission at
490 nm) of between 63 and 73 % of the absorbance at
the maximum of absorbance.



In another embodiment, the invention relates to a
fluorescent protein with an emission spectrum which
has:

- a maximum of emission (fluorescence - measured
following excitation at 469 nm) at between 485 and 495
nm, and in particular at about 490 nm,

and preferably one, and more preferably both, of the
following:

- an emission at 480 nm (fluorescence - measured
following excitation at 469 nm) of between 46 and 56 %
of the emission at the maximum of emission;

- an emission at 525 nm (fluorescence - measured
following excitation at 469 nm) of between 33 and 43 %
of the emission at the maximum of emission;

and with an excitation spectrum which has:

- a maximum of absorbance (measured at emission at
490 nm) at between 464 and 474 nm, and in particular
at about 469 nm; and at least any one, preferably at
least any two, more preferably at least any three, and
most preferably all four of the following:

- an absorbance at 440 nm (measured at emission at
490 nm) of between 48 and 58 % of the absorbance at
the maximum of absorbance;

- an absorbance at 452 nm (measured at emission at
490 nm) of between 55 and 65 % of the absorbance at
the maximum of absorbance;

- an absorbance at 456 nm (measured at emission at
490 nm) of between 52 and 62 % of the absorbance at
the maximum of absorbance;

- an absorbance at 480 nm (measured at emission at
490 nm) of between 48 and 58 % of the absorbance at
the maximum of absorbance.

In yet another embodiment, the invention relates to a
fluorescent protein with an emission spectrum which
has:

- a maximum of emission (fluorescence - measured
following excitation at 451 nm) at between 479 and 489
nm, and in particular at about 484 nm, and preferably
one, and more preferably both, of the following:

- an emission at 470 nm (fluorescence - measured
following excitation at 451 nm) of between 39 and 49 %
of the emission at the maximum of emission;

- an emission at 525 nm (fluorescence - measured
following excitation at 451 nm) of between 31 and 41 %
of the emission at the maximum of emission;

and with an excitation spectrum which has:

- a maximum of absorbance (measured at emission at
484 nm) at between 446 and 456 nm, and in particular
at about 451 nm; and at least any one, preferably at
least any two, more preferably at least any three, and
most preferably all four of the following:

- an absorbance at 420 nm (measured at emission at
484 nm) of between 61 and 71 % of the absorbance at
the maximum of absorbance;

- an absorbance at 440 nm (measured at emission at
484 nm) of between 86 and 96 % of the absorbance at
the maximum of absorbance;

- an absorbance at 447 nm (measured at emission at
484 nm) of between 84 and 94 % of the absorbance at
the maximum of absorbance;

- an absorbance at 470 nm (measured at emission at
484 nm) of between 61 and 71 % of the absorbance at
the maximum of absorbance.

Also, any protein with an emission and/or excitation
spectrum as indicated above preferably has a degree
of sequence identity of at least 70%, preferably at
least 80%, more preferably at least 90% and even more
preferably at least 95% with at least one of the
proteins encoded by at least one of the nucleotide
sequences depicted in Figure 1, in which the
percentage sequence homology is determined as
described hereinbelow.

For some of the clones described hereinbelow,
pertinent values are given in Figure 28.

Preferably, the nucleic acid molecule is a DNA and
more preferably a cDNA molecule. The cDNA molecules
are preferably isolated from the Discosoma or Polythoa
genus of coral although they may also be synthetically
prepared using techniques which would be well known to
practitioners skilled in the art. Preferably, the
nucleic acid sequences encoding the novel proteins are
as set forth in Figure 1.

Preferably, the nucleic acid molecule is substantially
homologous to the nucleic acid sequences depicted in
Figure 1. Even more preferably the nucleic acid
molecule has at least 70%, preferably at least 80%, more
preferably at least 90% and even more preferably at
least 95% sequence identity to at least one of the
nucleic acid sequences depicted in Figure 1 and even
more preferably comprises any of the nucleic acid
sequences of Figure 1.

The fluorescent proteins of the invention can be used
for any application known per se for fluorescent
proteins described in the art, such as for the green
fluorescent protein from Aequorea victoria mentioned
above. Such applications will be clear to the skilled
person, and may include, but are not limited to, the
applications of such "GFPs" mentioned in the relevant
prior art, such as WO 95/07463, WO 97/11094, WO
97/42320, WO 98/06737 and WO 97/41228.

As such, the fluorescent proteins of the invention
(and/or the nucleic acid sequences encoding these
proteins) may be used as a label and/or marker, and in
particular as a genetic marker and/or an expression
marker, for instance in the fields of (micro-)biology,
biochemistry and/or molecular biology.

For example, the fluorescent proteins of the
invention (and/or the nucleic acid sequences encoding
these proteins) may be used in in vitro applications,
such as hybridisation assays and/or immunological assays
(e.g. ELISAs).

However, fluorescent proteins of the invention are
particularly suited for applications in vivo,
including but not limited to expression and/or use in
bacteria, protozoa, fungi, algae, yeast cells or other
micro-organisms; in (cells or tissues of) plants
and/or animals; and/or in cells or cell lines derived
from plant cells or animal cells.

One particularly preferred application involves the
expression and use in species of nematode, such as
C.elegans, e.g. for screens or assays involving the
use of such nematodes.

Some other possible applications include, but are not
limited to:

- follow up of a protein tagged with a fluorescent
protein during the purification of said protein (e.g.
using chromatography techniques);

- in vivo expression analysis;

- investigation of the transport of proteins etc.
across biological membranes; and/or (other)
qualitative and/or quantitative detection techniques
and/or analytical techniques.

The nucleic acid molecules of the present invention
are particularly useful in processes for labelling
polypeptides of interest, e.g., by the construction of
genes encoding fluorescent fusion proteins.
Fluorescence labelling via gene fusion is
site-specific and eliminates the present need to purify the
labelled proteins in vitro and microinject them into
cells. Sequences encoding the fluorescing proteins of
the present invention may be used for a wide variety
of purposes as are well known to those working in the
field. For example, the sequences may be employed as
reporter genes for monitoring the expression of the
sequence fused thereto; unlike other reporter genes,
the sequences require neither substrates nor cell
disruption to evaluate whether expression has been
achieved. Similarly, the sequences of the present
invention may be used as a means to trace lineage of a
gene fused thereto during the development of a cell or
organism. Further, the sequences of the present
invention may be used as a genetic marker; cells or
organisms labelled in this manner can be selected by
e.g. fluorescence-activated cell sorting. The
sequences of the present invention may also be used as
a fluorescent tag to monitor protein expression in
vivo and/or in vitro or to encode donors or acceptors
for fluorescence resonance energy transfer. Other
uses for the sequences of the present invention would
be readily apparent to those working in the field, as
would appropriate techniques for fusing a gene of
interest to an oligonucleotide sequence of the present
invention in the proper reading frame and in a
suitable expression vector so as to achieve expression
of the combined sequence.

Similarly, fusion proteins including an antibody fused
to the fluorescing protein may also be generated for
in vivo labelling, for example. In such an embodiment
the nucleic acid molecule of the invention encoding
the fluorescing protein will be operably linked to the
sequence encoding the antibody. As would be well
known in the art, only a small portion of an antibody
molecule, the paratope, is involved in binding to the
epitope of a protein, and a nucleic acid molecule
encoding the paratope may be used to generate a
labelled molecule specific for the paratope of
interest.

A fusion protein of the 3' sequence of Discosoma
coupled to the 5' sequence of Polythoa 2 was also
generated using the nucleic acid sequences encoding
the Polythoa 2 and Discosoma 1 proteins, for expression
in a prokaryotic and eukaryotic expression system,
which protein sequences are illustrated in Figure 7.
The plasmid pGR15 encoding the sequence of the
Polythoa 2-Discosoma 1 hybrid was the vector used for
expression of the fusion protein in E.coli, whereas
plasmid pGR18 was utilised for eukaryotic expression
in COS cells. Plasmid pGR20 was used for expression
in C.elegans, and transformation of the relevant cells
or organism using these vectors resulted in expression
of a fluorescing protein.

As outlined in more detail in the examples below,
mutant or hybrid proteins were also developed to
investigate their absorbance and emission spectra
compared to the wild type Polythoa and Discosoma
proteins. The proteins and polypeptides encoded by
plasmids pGR3 and pGR7 described herein contain a 109
thioredoxin associated fragment in fusion with the
Polythoa 2 fluorescing protein. Furthermore, plasmid
pGR7 encodes a protein with the mutation Q136R while a
further plasmid pGR10 expresses a I106T mutant.

An antisense molecule capable of hybridising to the
nucleic acid molecules of the invention under
conditions of high stringency also forms part of the
invention.

Stringency of hybridisation as used herein refers to
conditions under which polynucleic acids are stable.
The stability of hybrids is reflected in the melting
temperature (Tm) of the hybrids. Tm can be
approximated by the formula:

Tm = 81.5°C + 16.6 (log10[Na+]) + 0.41 (%G&C) - 600/l

wherein l is the length of the hybrids in nucleotides.
Tm decreases approximately by 1-1.5°C with every 1%
decrease in sequence homology.
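By way of a worked illustration, the approximation above can be evaluated numerically. The following Python sketch is an assumption-laden convenience, not part of the disclosure: the function names and the choice of a sodium concentration expressed in mol/L are illustrative, and the G+C percentage is simply counted from the supplied sequence.

```python
import math


def approximate_tm(sequence, sodium_molar):
    """Approximate Tm in degrees C: 81.5 + 16.6*log10[Na+] + 0.41*(%G&C) - 600/l."""
    seq = sequence.upper()
    length = len(seq)
    if length == 0:
        raise ValueError("sequence must not be empty")
    if sodium_molar <= 0:
        raise ValueError("sodium concentration must be positive")
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / length
    return 81.5 + 16.6 * math.log10(sodium_molar) + 0.41 * gc_percent - 600.0 / length


def tm_after_mismatch(tm, percent_mismatch, drop_per_percent=1.0):
    """Lower Tm by 1 to 1.5 degrees C for every 1% decrease in homology."""
    if not 1.0 <= drop_per_percent <= 1.5:
        raise ValueError("drop_per_percent is expected to lie between 1 and 1.5")
    return tm - drop_per_percent * percent_mismatch


if __name__ == "__main__":
    tm = approximate_tm("AT" * 50, sodium_molar=1.0)
    print(round(tm, 1), round(tm_after_mismatch(tm, 10, 1.5), 1))
```

For a 100-nucleotide hybrid with no G or C at 1 M Na+, the logarithmic and G&C terms vanish and only the length correction of 600/100 = 6°C is subtracted from 81.5°C.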

The term "stringency" refers to the hybridisation
conditions wherein a single-stranded nucleic acid
joins with a complementary strand when the purine or
pyrimidine bases therein pair with their corresponding
base by hydrogen bonding. High stringency conditions
favour homologous base pairing whereas low stringency
conditions favour non-homologous base pairing.

"Low stringency" conditions comprise, for example, a
temperature of about 37°C or less, a formamide
concentration of less than about 50%, and a moderate
to low salt (SSC) concentration; or, alternatively, a
temperature of about 50°C or less, and a moderate to
high salt (SSPE) concentration, for example 1M NaCl.

"High stringency" conditions comprise, for example, a
temperature of about 42°C or less, a formamide
concentration of less than about 20%, and a low salt
(SSC) concentration; or, alternatively, a temperature
of about 65°C, or less, and a low salt (SSPE)
concentration. For example, high stringency
conditions comprise hybridization in 0.5 M NaHPO4, 7%
sodium dodecyl sulfate (SDS), 1 mM EDTA at 65°C
(Ausubel, F.M. et al. Current Protocols in Molecular
Biology, Vol. I, 1989; Green Inc. New York, at
2.10.3).

"SSC" comprises a hybridization and wash solution. A
stock 20X SSC solution contains 3M sodium chloride,
0.3M sodium citrate, pH 7.0.

"SSPE" comprises a hybridization and wash solution. A
1X SSPE solution contains 180 mM NaCl, 9 mM Na2HPO4 and
1 mM EDTA, pH 7.4.
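Working concentrations of these buffers follow from the stated stock compositions by simple proportional scaling. The short Python sketch below illustrates this under stated assumptions: the dictionary and function names are hypothetical, concentrations are expressed in mol/L, and pH adjustment is ignored.

```python
# Illustrative sketch only; names are assumptions.
# Component concentrations in mol/L at the stated fold-strength.
SSC_20X = {"NaCl": 3.0, "sodium citrate": 0.3}
SSPE_1X = {"NaCl": 0.180, "Na2HPO4": 0.009, "EDTA": 0.001}


def scale_buffer(stock, stock_fold, target_fold):
    """Scale every component from stock_fold strength to target_fold strength."""
    if stock_fold <= 0 or target_fold <= 0:
        raise ValueError("fold strengths must be positive")
    factor = target_fold / stock_fold
    return {component: molar * factor for component, molar in stock.items()}


if __name__ == "__main__":
    print(scale_buffer(SSC_20X, 20, 1))   # 1X SSC
    print(scale_buffer(SSPE_1X, 1, 5))    # 5X SSPE
```

On this basis a 1X SSC solution would contain 0.15 M sodium chloride and 0.015 M sodium citrate, and a 5X SSPE solution 0.9 M NaCl.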

However, other conditions and reagents also result in
stringent hybridisation conditions and these are
generally well known to the skilled practitioner
(Molecular Cloning: A Laboratory Manual, J. Sambrook et
al., Cold Spring Harbour Press, 1989, or Current
Protocols in Molecular Biology, F.M. Ausubel, et al.,
eds., John Wiley & Sons Inc., New York).

As would be appreciated by those skilled in the art,
the presence of introns in a nucleic acid sequence can
lead to enhanced expression levels. One of the
preferred nucleic acid molecules of the invention, the
sequence of which is depicted in Figure 2(b), includes
a synthetic intron in addition to a 5' UTR including a
Kozak site.

Fluorescent proteins or functional equivalents,
fragments or variants thereof encoded by the nucleic
acid molecules of the invention also form part of the
invention. Furthermore, according to an even further
aspect, the invention comprises an isolated
fluorescent protein capable of emitting fluorescence
upon irradiation by incident light wherein the maximal
absorbance of said incident light is in the range 440-
480 nm, in particular 450-475 nm (maximum of
excitation) and maximal fluorescence emission is in
the range 470-510 nm, in particular 480-500 nm
(maximum of emission). The invention also comprises
an isolated fluorescent protein comprising an amino
acid sequence which has at least 70%, preferably at
least 80%, more preferably at least 90% and even more
preferably at least 95% sequence identity to the amino
acid sequence depicted in any of Figures 3 to 8.

Functional equivalents, fragments or variants of the
polypeptide of the invention are those molecules that
retain the distinct fluorescing capability of the
polypeptides of the invention.

The DNA molecules according to the invention may,
advantageously, be included in a suitable expression
vector to express the fluorescent protein encoded
therefrom in a suitable host. Incorporation of cloned
DNA into a suitable expression vector for subsequent
transformation of said cell and subsequent selection
of the transformed cells is well known to those
skilled in the art as provided in Sambrook et al.
(1989), Molecular Cloning, A Laboratory Manual, Cold
Spring Harbour Laboratory Press.



An expression vector according to the invention
includes a vector comprising a nucleic acid according
to the invention operably linked to regulatory
sequences, such as promoter regions, that are capable
of effecting expression of said DNA fragments. The
term "operably linked" refers to a juxtaposition
wherein the components described are in a relationship
permitting them to function in their intended manner.
Such vectors may be transformed into a suitable host
cell to provide for expression of a polypeptide
according to the invention. Thus, in a further
aspect, the invention provides a process for preparing
polypeptides according to the invention which
comprises cultivating a host cell, transformed or
transfected with an expression vector as described
above, under conditions to provide for expression by
the vector of a coding sequence encoding the
polypeptides, and recovering the expressed
polypeptides.

The vectors may be, for example, plasmid, virus or
phage vectors provided with an origin of replication,
and optionally a promoter for the expression of said
nucleotide sequence and optionally a regulator of the
promoter. The vectors may contain one or more
selectable markers, such as, for example, ampicillin
resistance.

The precise nature of the regulatory sequences needed
for expression of the fluorescing protein can vary
between species or cell types. They will, however,
generally include 5' non-transcribing and 5'
non-translating sequences involved in initiation or
regulation of transcription and translation
respectively. Regulatory elements required for
expression generally include promoter sequences to
bind RNA polymerase and transcription initiation
sequences for ribosome binding. For example, a
bacterial expression vector may include a promoter
such as the lac promoter and for translation
initiation the Shine-Dalgarno sequence and the start
codon AUG. Similarly, a eukaryotic expression vector
may include a heterologous or homologous promoter for
RNA polymerase II, a downstream polyadenylation
signal, the start codon AUG, and a termination codon
for detachment of the ribosome. Such vectors may be
obtained commercially or assembled from the sequences
described by methods well known in the art.

Nucleic acid molecules according to the invention may
be inserted into the vectors described in an antisense
orientation in order to provide for the production of
antisense RNA. Antisense RNA or other antisense
nucleic acids may be produced by synthetic means.
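To illustrate what an antisense orientation means at the sequence level, the Python sketch below derives the reverse complement of a sense-strand DNA sequence and the corresponding antisense RNA. It is a minimal sketch under stated assumptions: the function names are hypothetical, and only the unambiguous bases A, C, G and T are handled.

```python
# Illustrative sketch only; names are assumptions.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    if set(dna.upper()) - set("ACGT"):
        raise ValueError("only unambiguous DNA bases are supported")
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_rna(sense_dna):
    """RNA complementary to the transcript of the given sense strand."""
    return reverse_complement(sense_dna).upper().replace("T", "U")


if __name__ == "__main__":
    print(reverse_complement("ATGGCC"))
    print(antisense_rna("ATGGCC"))
```

Cloning the insert in the reverse orientation places this reverse-complement strand downstream of the promoter, so that the transcript produced pairs with the sense messenger.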

As discussed in the examples provided it is desirable
to enhance the performance or expression levels of the
fluorescent proteins in organisms or cells other than
those from the coral species from which the proteins
or polypeptides of the invention are derived. Every
organism adopts a preferred codon usage which is
related to the presence and expression of tRNA genes
and which involves post-transcriptional expression
regulation. Such optimal codon usage has been
determined for a number of organisms. In the present
embodiment a vector was generated for optimal



WO 02/42323 PCT/EP01/13604

expression in the nematode C.elegans. Therefore, when
the host to be transfected with a vector including the
nucleic acid molecules of the invention is C.elegans,
the vector may comprise the plasmid pGR10, described
in the example below, which includes the nucleotide
sequence depicted in Figure 2(a).

Similarly, the introduction of synthetic introns can
result in enhancements of expression levels. A
nucleic acid molecule including such a
synthetic intron for increased expression levels in
C.elegans is particularly preferred, which molecule is
described in Figure 2(b).

Preferred vectors according to the invention comprise
the plasmids designated pGR3, pGR4, pGR5, pGR6, pGR7
and pDW2700, the sequences of which are illustrated in
Figures 9 to 14 respectively. Other preferred
plasmids according to the invention comprise plasmids
designated pGR1, pGR8, pGR13, pGR14, pGR15, pGR16,
pGR17, pGR18, pGR19, pGR20 and pGR10 identified in the
example provided, and which would be readily
producible by the skilled practitioner using the
method steps described.

In accordance with the present invention, a defined
nucleic acid includes not only the identical nucleic
acid but also any minor base variations including in
particular, substitutions in cases which result in a
synonymous codon (a different codon specifying the
same amino acid residue) due to the degenerate code in
conservative amino acid substitutions. The term
"nucleic acid sequence" also includes the
complementary sequence to any single stranded sequence
given regarding base variations.

The present invention also advantageously provides
nucleic acid sequences of at least approximately 10
contiguous nucleotides of a nucleic acid according to
the invention and preferably from 10 to 50 nucleotides
of the nucleic acid sequences set forth in Figures 1
and 2. These sequences may, advantageously, be used as
probes or primers to initiate replication, or the
like. Such nucleic acid sequences may be produced
according to techniques well known in the art, such as
by recombinant or synthetic means. They may also be
used in diagnostic kits or the like for detecting the
presence of a nucleic acid according to the invention.
These tests generally comprise contacting the probe
with the sample under hybridising conditions and
detecting for the presence of any duplex or triplex
formation between the probe and any nucleic acid in
the sample.
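As a simple illustration of what "contiguous nucleotides" of a given length means, the following Python sketch enumerates every contiguous window of a chosen length from an input sequence. The function name and the default length of 20 are assumptions made for illustration; no claim is made that any particular window would be a good probe or primer in practice.

```python
# Illustrative sketch only; the function name and default are assumptions.
def candidate_probes(sequence, length=20):
    """Return every contiguous window of the given length (10 to 50 nt)."""
    if not 10 <= length <= 50:
        raise ValueError("length must be between 10 and 50 nucleotides")
    seq = sequence.upper()
    # range() is empty when the sequence is shorter than the window.
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


if __name__ == "__main__":
    for probe in candidate_probes("ACGTACGTACGT", length=10):
        print(probe)
```

A sequence of n nucleotides yields n - length + 1 windows, so the 12-nucleotide example gives three 10-mers.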

Letters utilised in the sequences according to the
invention which are not recognisable as letters of the
genetic code signify a position in the nucleic acid
sequence where one or more of bases A, G, C or T can
occupy the nucleotide position. Representative
letters used to identify the range of bases which can
be used are as follows:



M: A or C

R: A or G

W: A or T

S: C or G

Y: C or T

K: G or T

V: A or C or G

H: A or C or T

D: A or G or T

B: C or G or T

N: G or A or T or C
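The letter codes listed above can be applied mechanically to expand a degenerate sequence into every concrete sequence it represents. The Python sketch below shows one way to do so; the names `AMBIGUITY`, `expand` and `degeneracy` are assumptions made for illustration, and the mapping simply restates the list above.

```python
from itertools import product

# Mapping restating the letter codes listed above.
AMBIGUITY = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for letter in primer.upper():
        count *= len(AMBIGUITY[letter])
    return count


def expand(primer):
    """List every concrete sequence represented by a degenerate primer."""
    choices = (AMBIGUITY[letter] for letter in primer.upper())
    return ["".join(bases) for bases in product(*choices)]


if __name__ == "__main__":
    print(degeneracy("ATGNR"))
    print(expand("AY"))
```

Because the number of expansions grows multiplicatively with each ambiguous position, `degeneracy` is worth checking before calling `expand` on a long, highly degenerate primer.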

According to the present invention, degenerate primers
were utilised to fully identify the sequence of the
nucleic acid encoding the proteins of the invention.
Those novel molecules as described in the example
provided also form part of the present invention.

According to the present invention these probes may be
anchored to a solid support. Preferably, they are
present on an array so that multiple probes can
simultaneously hybridize to a single biological
sample. The probes can be spotted onto the array or
synthesised in situ on the array. (See Lockhart et
al., Nature Biotechnology, vol. 14, December 1996,
"Expression monitoring by hybridisation to high
density oligonucleotide arrays".) A single array can
contain more than 100, 500 or even 1,000 different
probes in discrete locations.

The nucleic acid sequences according to the invention
may be produced using such recombinant or synthetic
means, such as for example using PCR cloning
mechanisms which generally involve making a pair of
primers, which may be from approximately 10 to 50
nucleotides, to a region of the gene which is desired
to be cloned, bringing the primers into contact with
mRNA, cDNA, or genomic DNA from a suitable biological
source, and in particular from (a cell of) a species
of coral, more particularly from (a cell of) a species
of coral from the genus Polythoa and/or the genus
Discosoma, performing a polymerase chain reaction
under conditions which bring about amplification of
the desired region, isolating the amplified region or
fragment and recovering the amplified DNA. Some of
the primers suitable for the aforementioned method
include, but are not limited to, the individual
primers mentioned in Table 1 as well as the
combinations thereof mentioned in Table 2. Generally,
such techniques are well known in the art, such as
described in Sambrook et al. (Molecular Cloning: a
Laboratory Manual, 1989). Another suitable technique
involves "gene shuffling" (DNA shuffling by random
fragmentation and reassembly: In vitro recombination
for molecular evolution: Proc. Natl. Acad. Sci. Vol
91, pp 10747-10751, October 1994).

Therefore, it is also envisaged that - based upon the
disclosure herein and (for instance) using one or more
of the primers listed in Table 1 or a suitable
combination thereof (including but not limited to the
combinations mentioned in Table 2) - the skilled person
will be able to isolate (nucleic acids encoding)
additional fluorescent proteins of the invention from
other suitable biological sources, and in particular
from other species of coral such as (other) species
from the genus Polythoa and/or the genus Discosoma;
and such (nucleic acids encoding such) additional
fluorescent proteins are also within the scope of the
present invention.

In one preferred embodiment, any such nucleic acids
will have at least 70%, preferably at least 80%, more
preferably at least 90% and even more preferably at
least 95% sequence identity with at least one of the
nucleotide sequences depicted in Figure 1, in which
the percentage sequence homology is determined as
described above; and/or will be capable of hybridizing with
at least one of the nucleotide sequences depicted in
Figure 1 under conditions of high stringency, again as
described above.

The term "homologous" describes the relationship
between different nucleic acid molecules or amino acid
sequences wherein said sequences or molecules are
related by partial identity or similarity at one or
more blocks or regions within said molecules or
sequences. Homology may be determined by means of
computer programs known in the art.

Substantial homology preferably means that the
nucleotide and amino acid sequences of the
fluorescent proteins of the invention comprise a
nucleotide and amino acid sequence fragment,
respectively, corresponding to and displaying a certain
degree of sequence identity to the sequences set forth
in Figures 1 and 2 for the nucleotide sequences and
Figures 3 to 8 for the polypeptide sequences.
Preferably they share an identity of at least 30%,
preferably 40%, more preferably 50%, still more
preferably 60%, most preferably 70%; an identity of at
least 80%, preferably more than 90% and still more
preferably more than 95% is particularly desired with
respect to the nucleotide or amino acid sequences
depicted in Figures 1 to 8 respectively. A preferred
method for determining the best overall match between
a query sequence (a sequence of the present invention)
and a subject sequence, also referred to as a global
sequence alignment, uses, for example, the FASTDB
computer program based on the algorithm of Brutlag et
al. (Comp. App. Biosci. 6 (1990), 237-245). In a
sequence alignment the query and subject sequences are
both DNA sequences. An RNA sequence can be compared by
converting U's to T's. The result of said global
sequence alignment is expressed in percent identity.
Further programs that can be used in order to
determine homology/identity are described below and in
the examples. The sequences that are homologous to the
sequences described above are, for example, variations
of said sequences which represent modifications having
the same biological function, in particular encoding
proteins with the same or substantially the same
receptor specificity, e.g. binding specificity. They
may be naturally occurring variations, such as
sequences from other mammals, or mutations. These
mutations may occur naturally or may be obtained by
mutagenesis techniques. The allelic variations may be
naturally occurring allelic variants as well as
synthetically produced or genetically engineered
variants. In a preferred embodiment the sequences are
derived from a human.
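
By way of illustration only, the percent identity figure referred to above can be sketched in Python as follows. The sketch assumes that the two sequences have already been aligned and have equal length (gaps written as "-"), and it divides by the alignment length; it is not the FASTDB algorithm, and the function name and the example sequences are hypothetical.

```python
def percent_identity(query: str, subject: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    U's are converted to T's first, so an RNA sequence can be compared
    with a DNA sequence. Gap positions ('-') never count as matches.
    """
    q = query.upper().replace("U", "T")
    s = subject.upper().replace("U", "T")
    if len(q) != len(s):
        raise ValueError("aligned sequences must have the same length")
    if not q:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(q, s) if a == b and a != "-")
    return 100.0 * matches / len(q)


# Hypothetical 10-nucleotide example: an RNA query against a DNA subject.
identity = percent_identity("AUGGCUAAGC", "ATGGCTAAGG")
# Compare against one of the thresholds named in the text (e.g. 70%).
meets_70_percent = identity >= 70.0
```

A dedicated alignment program would normally be used to produce the alignment itself; the sketch only covers the final counting step.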

A further aspect of the invention provides host cells
transformed or transfected with a vector according to
the invention. Such cells can be of prokaryotic or
eukaryotic origin. Suitable prokaryotes include gram
positive or negative organisms including E.coli,
Bacillus spp, Pseudomonas spp, or Salmonella
typhimurium. The expression vector used to transform
the prokaryotic cells, and particularly E.coli,
preferably comprises plasmids designated pGR3 and
pGR7, the sequences of which are illustrated in Figures
9 and 13 respectively. Eukaryotic organisms include
yeasts or fungi and plant cells which utilise a
transfection system based on infection by
Agrobacterium tumefaciens.

The vectors can also be used to transform cells in
tissue culture in addition to non-human organisms and
these also form part of the invention. Typical
mammalian tissue culture cells include COS-7, HEK-293,
BHK, CHO, HeLa cells and the like. Suitable organisms
which may be useful to monitor expression of proteins
using the novel fluorescing proteins of the invention
include C.elegans, which is particularly advantageous
as the fluorescing protein can be viewed in vivo.

When the organism to be transformed with the
appropriate vector is C.elegans, the vector preferably
comprises the sequence of the plasmid illustrated in
Figure 12 or a vector adapted for expression of
heterologous proteins in C.elegans including the
nucleotide sequences illustrated in Figure 2.

Transformation of a host cell with recombinant DNA may
be carried out by conventional techniques as are well
known to those skilled in the art. Where the host is
prokaryotic, such as E.coli, competent cells which are
capable of DNA uptake can be prepared from cells
harvested after exponential growth phase and
subsequently treated by the CaCl2 method by procedures
well known in the art. Alternatively, MgCl2 or RbCl
can be used. Transformation can also be performed
after forming a protoplast of the host cell or by
electroporation.

When the host is a eukaryote, such methods of
transfection of DNA as calcium phosphate
co-precipitates, conventional mechanical procedures such
as microinjection, electroporation, insertion of a
plasmid encased in liposomes, or virus vectors may be
used. Eukaryotic cells can also be cotransfected with
DNA sequences encoding the fusion polypeptide and a
selectable marker, such as the herpes simplex
thymidine kinase gene. Another method is to use a
eukaryotic viral vector, such as simian virus 40
(SV40) or bovine papilloma virus, to transiently
infect or transform eukaryotic cells and express the
proteins (Eukaryotic Viral Vectors, Cold Spring Harbor
Laboratory, Gluzman ed., 1982).

Also encompassed within the scope of the present
invention is a method of producing a polypeptide
according to the invention comprising cultivating a
host cell or tissue transformed or transfected with
the appropriate vector of the invention under
conditions suitable for expression of the fluorescent
protein and optionally recovering the expressed
protein. The protein may be recovered and purified
from the recombinant cell cultures by methods known in
the art, including ammonium sulfate or ethanol
precipitation, acid extraction, anion or cation
exchange chromatography, phosphocellulose
chromatography, hydrophobic interaction
chromatography, affinity chromatography,
hydroxyapatite chromatography and lectin
chromatography.

In a further aspect, the invention also comprises an
oligonucleotide probe or primer which comprises a
sequence that selectively hybridises to a nucleic acid
molecule according to the invention. The
oligonucleotide preferably comprises a sequence of at
least 10 contiguous nucleotides and is preferably
between 10 and 50 nucleotides in length.

Advantageously, the novel proteins of the invention,
as aforementioned, are particularly useful for
monitoring expression of proteins within biological
systems and the subcellular localisation or
trafficking of proteins. To determine the expression
pattern of a particular protein of interest it
suffices in principle to make a fusion between the
promoter of the gene of interest and the sequence
encoding the fluorescing protein. Upon introduction
of a vector carrying the fusion of the promoter and
the fluorescent protein of the invention into a cell
or organism, any expression induced by the promoter
can easily be monitored by following the expression of
the protein of the invention. To monitor the
subcellular localisation of a protein it generally
suffices to make a fusion between the protein of
interest and the GFP protein, which can be done at
either the N or C terminus of the protein.

Therefore, in a further aspect the present invention
comprises a method for selecting cells capable of
expressing a protein of interest, comprising
introducing into said cells a vector comprising the
nucleotide sequence of a fluorescent protein according
to the invention operatively linked to a promoter or
regulatory region of the protein of interest,
cultivating the cell under conditions necessary for
expressing the protein of interest and monitoring for
any fluorescence following expression of said
fluorescent protein.

In accordance with the present invention, a protein of
interest includes any protein to be monitored or
labelled by virtue of being attached or expressed
together with the proteins of the invention. The
techniques for generating fusion proteins using the
proteins of the invention are well known to those in
the art.

A particular use of fluorescent proteins consists of
the construction of a synthetic protein harboring a
donor fluorescent protein and an acceptor fluorescent
protein, connected by a binding protein moiety. The
two fluorescent proteins change conformation upon
binding of an analyte to the binding protein moiety.
The binding protein moiety has an analyte-binding
region which binds an analyte and causes the indicator
to change conformation upon exposure to the analyte.
The donor fluorescent protein is covalently coupled to
the binding moiety. The acceptor fluorescent protein
moiety is also covalently coupled to the binding
protein moiety. In the fluorescent indicator the
donor moiety and the acceptor moiety change position
relative to each other when the analyte binds to the
analyte binding region, altering fluorescence
resonance energy transfer (FRET) between the donor
moiety and the acceptor moiety when the donor moiety
is excited. Such a system has been described
previously by Tsien et al. (WO 98/40477) and Garman
(WO 94/28166). These molecules are very efficient in
measuring internal concentrations of analytes such as
cAMP, Ca2+, etc., as well as for measurement of
internal enzymatic activities of enzymes such as
proteases, esterases, etc. The novel fluorescent
proteins according to the present invention and
functional equivalents, derivatives or fragments
thereof can be used to develop new FRET molecules.

Therefore, in a further aspect the present invention
comprises a method for producing fluorescence
resonance energy transfer comprising: providing an
acceptor molecule comprising a fluorescent protein
according to the invention; providing an appropriate
donor molecule for the fluorescent protein; and
bringing the donor molecule and acceptor molecule into
sufficiently close contact to allow fluorescence
resonance energy transfer. Alternatively, the donor
molecule can be the fluorescent protein of the
invention, in which case an appropriate acceptor
molecule for the fluorescent protein is provided.
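
The requirement for "sufficiently close contact" can be made concrete with the standard Förster relation, E = 1 / (1 + (r/R0)^6), in which r is the donor-acceptor distance and R0 is the Förster radius of the pair. The following Python sketch is illustrative only: the relation itself is textbook physics and is not taken from this specification, and the 5 nm Förster radius used in the example is an assumed placeholder, not a measured value for the proteins of the invention.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Energy transfer efficiency from the Förster relation.

    E = 1 / (1 + (r / R0)**6), where r is the donor-acceptor distance and
    R0 is the distance at which half of the donor excitation is transferred.
    """
    if distance_nm < 0:
        raise ValueError("distance must not be negative")
    if forster_radius_nm <= 0:
        raise ValueError("Förster radius must be positive")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


# Assumed placeholder value; the real R0 depends on the donor/acceptor pair.
ASSUMED_R0_NM = 5.0

for r in (2.5, 5.0, 10.0):
    print(f"r = {r:4.1f} nm -> E = {fret_efficiency(r, ASSUMED_R0_NM):.3f}")
```

Because of the sixth-power dependence, the efficiency falls steeply once the two moieties are separated by more than the Förster radius, which is why a conformational change in the binding moiety produces a measurable change in signal.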

The invention may be more clearly understood from the
following description of an exemplary embodiment with
reference to the accompanying Figures wherein:



Figure 1      is an illustration of the nucleotide
              sequences encoding for fluorescent
              proteins from the Polythoa and
              Discosoma species of coral.

Figure 2 (a)  is an illustration of the sequence of
              the DNA fragment encoding Polythoa 2
              protein with optimal codon usage for
              expression in C.elegans.
         (b)  is an illustration of the sequence from
              (a) further including introns and a 5'
              untranslated region containing a Kozak
              sequence.



Figure 3 (a-d) is an illustration of the polypeptide
              sequences of Polythoa 1 and 2 and
              Discosoma 1 and 2 encoded by the
              nucleic acid molecules of the
              invention.

Figure 4      is an illustration of the sequence of a
              Polythoa fusion protein encoded by
              plasmid pGR3 and which includes a 109
              amino acid thioredoxin fragment fused
              to the Polythoa 2 polypeptide sequence.

Figure 5      is an illustration of the sequence of a
              Polythoa 2 fluorescent fusion protein
              in pGR7 which also incorporates the 109
              amino acid thioredoxin fragment.

Figure 6      is an alignment of the proteins encoded
              by the plasmids indicated A-J therein.

Figure 7      is a further alignment of the protein
              sequences of the Polythoa 2, Discosoma
              1 hybrid and the proteins encoded by
              the plasmids indicated therein.

Figure 8 (a)  is a further alignment of the
              translation products from the DNA
              fragments indicated therein.

Figure 8 (b)  is an alignment of some of the clones
              used in the present invention.

Figure 9      is an illustration of the nucleotide
              sequence of plasmid pGR3.

Figure 10     is an illustration of the nucleotide
              sequence of plasmid pGR4.

Figure 11     is an illustration of the nucleotide
              sequence of plasmid pGR5.

Figure 12     is an illustration of the nucleotide
              sequence of plasmid pGR6.

Figure 13     is an illustration of the nucleotide
              sequence of plasmid pGR7.

Figure 14     is an illustration of the nucleotide
              sequence of plasmid pDW2700.



Figure 15     is a graphic representation of the
              emission spectrum of the thioredoxin-
              FP-fusion protein from pGR3 at (a) 452
              nm and (b) 489 nm excitation.

Figure 16     is a graphic representation of the
              emission spectrum of thioredoxin-FP-
              fusion protein from pGR3 at 469 nm
              excitation.

Figure 17     is a graphic representation of the pGR3
              excitation spectrum at an emission of
              490 nm.

Figure 18     is a graphic representation of the
              excitation spectrum of thioredoxin-FP-
              fusion protein from pGR7 at 490 nm
              emission.

Figure 19     is a graphic representation of the
              emission spectrum of thioredoxin-FP-
              fusion protein from pGR7 at 452 nm
              excitation.



Figure 20     illustrates combined emission and
              excitation spectra of thioredoxin-FP-
              fusion protein from pGR7.

Figure 21     is a graphic representation of the
              emission spectrum of thioredoxin-FP-
              fusion protein from pGR13 at 452 nm
              excitation.

Figure 22     is a graphic representation of the
              emission spectrum of thioredoxin-FP-
              fusion protein of pGR13 at 469 nm
              excitation.

Figure 23     is a graphic representation of the
              excitation spectrum of the thioredoxin-
              FP-fusion proteins from pGR13 at 490 nm
              emission.

Figure 24     is a graphic representation of the
              emission spectrum of thioredoxin-FP-
              fusion protein pGR15 at (a) 489 nm
              excitation and (b) 451 nm excitation.

Figure 25     is a graphic representation of the
              emission spectrum of thioredoxin-FP-
              fusion protein pGR15 at 440 nm
              excitation.

Figure 26     is a graphic representation of the
              emission spectrum of thioredoxin-FP-
              fusion protein pGR15 at 440 nm
              excitation.

Figure 27     is a list of the clones used in
              accordance with the invention.

Figure 28     is a list of pertinent absorbance and
              emission values for some of the clones
              used.

Examples:

1) Isolation of cDNA encoding for new fluorescent
proteins

a) Isolation of RNA 



Two brightly fluorescent Anthozoa species
(Polythoa and Discosoma species) were used to isolate
fluorescent proteins. This type of coral can be
obtained from aquarium supply outlets, and such corals
can also be obtained from various coral reefs. The
corals, and more particularly the polyps expressing
high levels of fluorescent protein, were flash-frozen
in liquid nitrogen. Methods to isolate material
samples are common in molecular biology techniques,
and have been described in "Current Protocols in
Molecular Biology", ed. by Ausubel et al., John Wiley
& Sons, Inc.

Total RNA was isolated from the frozen specimens using
TRIzol(TM) Reagent (Cat. NO. 15596; Life Technologies),
according to the manufacturer's procedure, and the
total RNA was finally re-suspended in DEPC water
(Current Protocols in Molecular Biology, ibid).

b) First strand cDNA synthesis

First strand cDNA was prepared using the total
RNA isolations as described above from the Polythoa or
the Discosoma species. Random primers were provided by
Life Technologies (Cat. NO. 48190-11) and cDNA was
synthesized using the Superscript II kit (Cat. NO.
18064-022; Life Technologies). The protocol to
generate cDNA by RT-PCR was performed according to the
instructions of the manufacturers.

c) PCR with degenerate primers:

To isolate full cDNA sequences encoding for new
fluorescent proteins, a series of PCR procedures were
performed using the cDNA isolated as described above.
For these experiments, the following synthetic
degenerate primers were used:

oGR1:  CACCACATGGTU^GGAWRYKTNRAYGG;
oGR2:  ACCACATGGAAGGATGCKTNRAYGGNCA;
oGR3:  AATTTGTGATCAAGGGCRARGGNRWNGG;
oGR4:  GTGATCAAAGGTGGACCNYTNCCNTT;
oGR5:  GACATATTGTCAACAGAGTTYMANTAYG;
oGR6:  CATATTGTCAACAGAGTTYMANTAYGG;
oGR7:  ATCCTGACGACATACCAGAYTAYHWNAA;
oGR8:  GACTATTTCAAGCAGTCGTKYCCNGMNGG;
oGR9:  CATGGGAAAGGTCCTTGCAYTWYGARGA;
oGR10: GGTGACATCTCCTTTCARNAYNCC;
oGR11: CATATTCTCAGTGGANGSNTCCCA;
oGR12: CACAGGTCCATCGSNAGGRAARTT;
oGR13: CCATCGGCAGGAAARTTNANNCC;
oGR14: TGAATACCCTGTTTCCRTANTKRAA
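
The degenerate positions in the primers listed above follow the standard IUPAC nucleotide codes (for example R = A or G, Y = C or T, N = any base). As an illustrative sketch only, the following Python code counts how many distinct sequences a degenerate primer represents and, for small cases, enumerates them. The function names are hypothetical; the two primer sequences are copied from the list above.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])  # KeyError signals a non-IUPAC character
    return count


def expand(primer: str) -> list[str]:
    """Enumerate every concrete sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


oGR2 = "ACCACATGGAAGGATGCKTNRAYGGNCA"
oGR14 = "TGAATACCCTGTTTCCRTANTKRAA"

print(degeneracy(oGR2))    # K, N, R, Y, N -> 2 * 4 * 2 * 2 * 4
print(len(expand(oGR14)))  # R, N, K, R    -> 2 * 4 * 2 * 2
```

Primers whose sequence contains characters outside the IUPAC alphabet raise a KeyError rather than being silently accepted.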

The first strand cDNAs as isolated above were
subjected to PCR amplification using the set of
degenerate primers (oGR1 to oGR14) and Amplitaq Gold
(Perkin Elmer) as polymerase. Concentrations and
buffers were as provided by the manufacturer, or minor
modifications were applied as known in the art.
The PCR conditions were as follows:
An initial denaturation step at 95°C for 10', followed
by 25 cycles of touchdown PCR (30" at 95°C, 1' at
55°C (-0.2°C/cycle) and 1' at 72°C) and followed by 15
cycles of PCR (30" at 95°C, 1' at 50°C and 1' at 72°C).
The resulting PCR products were analyzed on
standard agarose gel and the DNA fragments of interest
were isolated and cloned into the pCR-XL-TOPO
vector (Cat. NO. K4700-20; Invitrogen).
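
The touchdown step described above lowers the annealing temperature by a fixed amount in each cycle. The following Python sketch lists the resulting annealing temperatures; it assumes that the first cycle anneals at the stated starting temperature and that the decrement is applied from the second cycle onward, which is an interpretation rather than something stated in the protocol. The function name is hypothetical.

```python
def touchdown_annealing(start_c: float, step_c: float, cycles: int) -> list[float]:
    """Annealing temperature (°C) for each cycle of a touchdown PCR.

    Cycle 1 uses start_c; every later cycle changes by step_c
    (negative for a decreasing, i.e. touchdown, profile).
    """
    if cycles < 1:
        raise ValueError("at least one cycle is required")
    # Rounded to one decimal to avoid floating-point noise in the listing.
    return [round(start_c + step_c * i, 1) for i in range(cycles)]


# Values from the protocol above: 25 cycles from 55°C at -0.2°C per cycle.
schedule = touchdown_annealing(55.0, -0.2, 25)
print(schedule[0], schedule[-1])
```

Under this reading the last touchdown cycle anneals just above the 50°C used for the 15 cycles that follow, so the two phases join without a jump in temperature.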

The following primer combinations resulted in the
isolation of appropriate DNA fragments.
On Polythoa first strand cDNA:
oGR1/oGR14, oGR6/oGR11, oGR2/oGR11, oGR3/oGR11,
oGR4/oGR11, oGR5/oGR11, oGR1/oGR11.
On Discosoma first strand cDNA:
oGR1/oGR10, oGR1/oGR11, oGR6/oGR10, oGR6/oGR11,
oGR2/oGR11

oGR1/oGR12, oGR1/oGR14, oGR6/oGR12, oGR6/oGR13,
oGR3/oGR11, oGR4/oGR11, oGR5/oGR11, oGR8/oGR11,
oGR9/oGR11

It would be apparent to a person skilled in the art
that other primer combinations will also result in the
isolation of DNA fragments encoding for fluorescent
proteins, such as the primer combinations oGR1/oGR13,
oGR2/oGR10, oGR2/oGR12, oGR2/oGR13, oGR2/oGR14,
oGR3/oGR10, oGR3/oGR12, oGR3/oGR13, oGR3/oGR14,
oGR4/oGR10, oGR4/oGR12, oGR4/oGR13, oGR4/oGR14,
oGR5/oGR10, oGR5/oGR12, oGR5/oGR13, oGR5/oGR14,
oGR7/oGR10, oGR7/oGR11, oGR7/oGR12, oGR7/oGR13,
oGR8/oGR10, oGR8/oGR12, oGR8/oGR13, oGR9/oGR10,
oGR9/oGR12, oGR9/oGR13.

c) establishing bona fide sequences. 

After initial sequencing of the cloned DNA
fragments, more specific primers were designed to
isolate the relevant cDNA from the two species.
For the Polythoa species:
oGR21: AAAGGCGTGCCCCTTCCTTTCGCTTTCGA;
oGR22: TGTCAACAGCATTCCAGTATGGCAACAGGGTA;
oGR23: TGAAGAGGGCGTTTGCACCACAAAGAGTG;
oGR24: AAAGGGGAGAAGCTTGACCCCAACGGCC;
oGR25: TTGAAAGCAGTCTGGTTGGCCTTTCTTGA;
oGR26: TGTGGTGCAAACGCCCTCTTCATATTTGAA;
oGR27: CCCTGTTGCCATACTGGAATGCTGTTGAC;
oGR28: AAGGAAGGGGCACGCCTTTAGTGACTGTAAG;
oGR29: CTTGCCTTGTCCCTCTCCCGTGATCGTGA.
For the Discosoma species:
oGR39: GGAGTU^GGAGAAGGAAAACCATACGAGGG;
oGR40: CCAGTACGGCAACAGGGCATTCACCAAAT;
oGR41: GGGAAAGAACCATGAATTTTGAAGACGGG;
oGR42: CCCCCCATTGGCCCAGTTATGCAGAAGAA;
oGR43: GCC2\ATGGGGGGAAAGTTCGCACCATCAA;
oGR44: CGCCCCCGTCTTCAAAATTCATGGTTCTT;
oGR45: CCTGTTGCCGTACTGGAACGCTGTTGTCA;
oGR46: TGGGAAGTCTTATGATGGCACCAATACCG;
oGR47: TTCAGGTAACCAAGGGTGGACCTCTGCCA;
oGR48: TGTCAGGCATCCGGAAGACATCGCTGATT;
oGR49: CATGCACTTTGAAGACGGTGGCGTGTGTT;
oGR50: TCATTGGTGATACAACACACGCCACCGTC;
oGR51: CATGACCCTTTCCCATGTAAATCCTTCGGGA;
oGR52: TTGTGGTGACAAAATAGGCCAAGCAAATGGC;
oGR53: GAAATAAAAGGCGACGGTCACGGGAAGCC;
oGR54: CATGGTAACCAAGGGTGGACCCCTGCCAT;
oGR55: AAANCTGTCGTTTCCCGAGGGATTTACAT;
oGR56: TGGCGTGATTTGCAGCNCCAATGATATCA;
oGR57: CGCCACCGNCTTCAAAGTGCATGACCCTT;
oGR58: ANCGGCTATGTCTTCAGGGTGCTTGACAA;
oGR59: GGTCCACCCTTGGTTACCATGAGCTTGACGTT.
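
Several of the specific primers listed above appear to be sense/antisense counterparts binding the same region of the cDNA. As an illustration, the following Python sketch computes a reverse complement and checks one such pair from the Polythoa list (oGR22 and oGR27). The observation that these two overlap is derived here from the listed sequences and is not a statement made in the specification; the function name is hypothetical.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# Sequences copied from the Polythoa primer list above.
oGR22 = "TGTCAACAGCATTCCAGTATGGCAACAGGGTA"
oGR27 = "CCCTGTTGCCATACTGGAATGCTGTTGAC"

# True if the antisense primer anneals within the region covered by oGR22.
same_region = reverse_complement(oGR27) in oGR22
print(reverse_complement(oGR27), same_region)
```

Such a pairing is consistent with one primer of each pair being used toward the 3' end and the other toward the 5' end in the RACE experiments.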



The following primer combinations are envisaged:
oGR21/oGR20, oGR22/oGR20, oGR23/oGR20, oGR24/oGR20,
oGR25/oGR30/oGR31, oGR26/oGR30/oGR31,
oGR27/oGR30/oGR31, oGR28/oGR30/oGR31,
oGR29/oGR30/oGR31, oGR25/oGR16, oGR25/oGR18,
oGR26/oGR16, oGR26/oGR18, oGR27/oGR16, oGR27/oGR18,
oGR28/oGR16, oGR28/oGR18, oGR29/oGR16, oGR29/oGR18,
oGR39/oGR20, oGR40/oGR20, oGR41/oGR20, oGR42/oGR20,
oGR43/oGR30/oGR31, oGR44/oGR30/oGR31,
oGR45/oGR30/oGR31, oGR43/oGR16, oGR43/oGR18,
oGR44/oGR16, oGR44/oGR18, oGR45/oGR16, oGR45/oGR18,
oGR46/oGR20, oGR47/oGR20, oGR48/oGR20, oGR49/oGR20,
oGR50/oGR30/oGR31, oGR51/oGR30/oGR31,
oGR52/oGR30/oGR31, oGR50/oGR16, oGR50/oGR18,
oGR51/oGR16, oGR51/oGR18, oGR52/oGR16, oGR52/oGR18,
oGR53/oGR20, oGR54/oGR20, oGR55/oGR20, oGR56/oGR20,
oGR57/oGR30/oGR31, oGR58/oGR30/oGR31,
oGR59/oGR30/oGR31, oGR57/oGR16, oGR57/oGR18,
oGR58/oGR16, oGR58/oGR18, oGR59/oGR16, oGR59/oGR18.



d) 3' and 5' RACE experiments

To clone the full length cDNA encoding for the
fluorescent proteins of the Polythoa species and the
Discosoma species, 3' and 5' RACE experiments were
performed. To facilitate these experiments additional
cDNA was prepared. Starting from the RNA isolations as
described above, new first strand cDNA synthesis was
performed using the SMART PCR cDNA Synthesis Kit (Cat.
NO. K1052-1; Clontech). 3' RACE PCR was performed
according to the manufacturer's instructions for the
SMART PCR cDNA Synthesis Kit. The 5' ends of the
cDNA fragments were amplified using a step-out RACE
strategy (Matz, M. et al. Amplification of cDNA ends
based on template-switching effect and step-out PCR.
Nucleic Acids Res. 27, 1558-1560 (1999)), or according
to the manufacturer's instructions for the SMART PCR
cDNA Synthesis Kit.

The 3' ends of the Polythoa species were
amplified in primary PCR reactions with the primer
combinations oGR1/oGR20 and oGR2/oGR20. A sample of
the primary PCR reaction was used as a template in
nested PCR reactions using primer combinations
oGR2/oGR20 and oGR3/oGR20 respectively.

The 3' ends of the Discosoma species were
amplified in primary PCR reactions with the specific
primer combination oGR39/oGR20 after which a nested
PCR was performed with primer combinations oGR40/oGR20
or oGR41/oGR20 or oGR42/oGR20. Primary PCR with
primer combination oGR41/oGR20 was nested with
oGR42/oGR20, and primary PCR reaction with primer
combination oGR47/oGR20 was nested with primer
combination oGR49/oGR20. Finally PCR reaction with
primer combination oGR41/oGR20 was nested with
oGR42/oGR20.
The primary PCR conditions were: 1' at 94°C, 30 PCR
cycles (30" at 94°C, 1' at 55°C and 5' at 68°C)
followed by 5' at 68°C.
The PCR conditions of the nested PCR were as
follows: 1' at 94°C followed by 35 cycles (30" at
94°C, 1' at 55°C and 5' at 72°C) and 5' at 72°C.

The 5' ends of the Polythoa species were
amplified in primary PCRs with the specific 5' RACE
primer combinations: oGR16/oGR28, oGR16/oGR25,
oGR16/oGR26, oGR16/oGR27, oGR16/oGR28 and oGR16/oGR29.
The following PCR conditions were used: 1' at 94°C, 20
PCR cycles (30" at 94°C, 1'30" at 72°C
(-0.2°C/cycle)), 20 PCR cycles (30" at 94°C and 1'30" at
68°C) followed by 5' at 68°C.

The 5' ends encoding for the Discosoma species
fluorescent proteins were amplified according to the
step-out PCR protocol as mentioned above. Primary PCRs
were performed with the 5' RACE primer combination
oGR10/oGR30/oGR31 and nested with the primer
combination oGR11/oGR30/oGR31.
Other primary PCR/nested PCR combinations were:
oGR11/oGR30/oGR31, nested with oGR12/oGR30/oGR31,
oGR12/oGR30/oGR31, nested with oGR13/oGR30/oGR31,
oGR13/oGR30/oGR31, nested with oGR14/oGR30/oGR31,
oGR43/oGR30/oGR31, nested with oGR44/oGR31 or
oGR45/oGR31,
oGR44/oGR30/oGR31, nested with oGR45/oGR31,
oGR50/oGR30/oGR31, nested with oGR51/oGR31 or
oGR52/oGR31,
oGR51/oGR30/oGR31, nested with oGR52/oGR31,
oGR52/oGR30/oGR31,
oGR59/oGR30/oGR31.
The primary and nested PCR conditions were: 1' at
94°C, 35 cycles of PCR (30" at 94°C, 1' at 55°C and 2'
at 72°C) followed by 5' at 72°C.
The 5' ends of the Discosoma species were also
amplified using specific 5' RACE primer combinations
oGR43/oGR16, oGR43/oGR18, oGR44/oGR16, oGR44/oGR18,
oGR45/oGR16, oGR45/oGR18, oGR50/oGR16, oGR50/oGR18,
oGR51/oGR16, oGR51/oGR18, oGR52/oGR16, oGR52/oGR18,
oGR59/oGR16 and oGR59/oGR18.
The PCR conditions were an initial denaturation of 1'
at 94°C, followed by 20 cycles of touchdown PCR (30"
at 94°C, 1' at 72°C (-0.2°C/cycle)), followed by 20
cycles of PCR (30" at 94°C and 1' at 68°C) and 5' at
68°C.

All the resulting PCR products of the 3' and 5' RACE
PCRs were analyzed on agarose gel and the appropriate
DNA bands of interest were isolated and cloned into
the pCR-XL-TOPO vector (Cat. NO. K4700-20; Invitrogen)
and further analyzed by sequence analysis.

Primers oGR20, oGR16, oGR18, oGR30 and oGR31 were
provided by the manufacturers and have the following
sequences:
oGR20: GTAATAGGAGTGAGTATAGGGGGGGAGTGGAGGGTTTTTTTTTTTTT
oGR16: AAGGAGTGGTATGAAGGCAGAGT
oGR18: AAGGAGTGGTAAGAAGGGAGAGT
oGR30: GTAATAGGAGTGAGTATAGGGGAAGGAGTGGTATGAAGGGAGAGT
oGR31: GTAATAGGAGTGAGTATAGGGG

30 

e) Cloning of full size cDNA encoding for
fluorescent proteins from Anthozoa species.

All cloning experiments were performed using
standard protocols as provided by the manufacturers or
as described by Ausubel et al. in Current Protocols in
Molecular Biology, ibid. Isolation of full length
cDNAs was also performed using the Titan One Tube
RT-PCR System (Boehringer Mannheim). The reactions were
performed according to the manufacturer's instructions.

i) Cloning of full size Polythoa 1 GFP cDNA

PCR was performed using specific primer
combinations oGR32/oGR34, oGR32/oGR35, oGR33/oGR34 and
oGR33/oGR35, and other primer combinations as
described above. The resulting fragments were isolated
and cloned in appropriate vectors, mainly the
pCR-XL-TOPO vector.

The resulting plasmid was designated pGR22 (using
primer combination oGR33:
CTTGGTGATTTGGGAGAAGGCAGATCGAG and oGR34:
CGTCTTGGCTTTTCGTTAAGCCTTTACTGGGG).

Polythoa 1 GFP cDNA was amplified by PCR using plasmid
DNA pGR22 as template and the primers oGR68:
CTGGAATTCTATTACTTTGAGTCTACCATCATGAGTGCAATT and oGR72:
CGTATCTCGAGCGTCTTGGCTTTTCGTTAAGCCTTTACTGGGG. The
resulting PCR products were analyzed by agarose gel
electrophoresis and the DNA of interest was isolated
and cloned into the pCR-XL-TOPO vector. The resulting
plasmid was designated pGR26.

ii) Cloning of full size Polythoa 2 GFP cDNA:

To isolate the full size cDNA clone of the
Polythoa species (here designated Polythoa 2), the
Titan One Tube RT-PCR System (Cat. NO. 1888382,
Boehringer Mannheim) was used. The reactions were
performed according to the manufacturer's procedure,
using specific primers oGR32 to oGR38. More
particularly the following primer combinations were
successful:
oGR32/oGR34, oGR32/oGR35, oGR33/oGR34, oGR33/oGR35,
oGR36/oGR37 and oGR36/oGR38.

oGR32: ACCTTGGTGATTTGGGAGAAGGCAGATCGAGAG;
oGR33: CTTGGTGATTTGGGAGAAGGCAGATCGAG;
oGR34: CGTCTTGGCTTTTCGTTAAGCCTTTACTGGGG;
oGR35: GAGAAACTTCTTTTTCACTTTGTTGTCGTCTTG;
oGR36: GACACTGGTGATTTGGGAGAAGGCAGATC;
oGR37: ATTGCGAGCCACGGCAACTTCATACAGC;
oGR38: GCCATAATCTGAAGAGGAGAATTGCGAGCCAC.

The resulting PCR products were analyzed by agarose
gel electrophoresis and the DNA of interest was
isolated and cloned into the pCR-XL-TOPO vector. The
resulting plasmids were designated pGR1 (using primer
combination oGR32/oGR34) and pGR8 (using primer
combination oGR36/oGR38).



iii) Cloning of full size Discosoma 1 GFP cDNA

As in the previous experiments, specific primers
were designed based upon the available sequence
information resulting from earlier PCR reactions and
3' and 5' RACE PCR experiments. The isolation of a
full length cDNA is analogous to that described above.

iv) Cloning of full size Discosoma 2 GFP cDNA

As in the previous experiments, specific primers
were designed based upon the available sequence
information resulting from earlier PCR reactions and
3' and 5' RACE PCR experiments. The isolation of a
full length cDNA is analogous to that described above.

2) Cloning of new fluorescent proteins cDNA in
expression vectors

a) Cloning of Polythoa 2 GFP cDNA in prokaryotic
expression vector:

Polythoa 2 GFP cDNA was amplified by PCR using
plasmid DNA pGR1 as template and the primers
oGR69: CTGGAATTCTCTACCGTCATGAGTGCAATTAAACCAGTCA and
oGR70: CGTATCTCGAGATTGCGAGCCACGGCAACTTCATACAGC,
or by using plasmid DNA pGR8 as template and the
primers oGR68:
CTGGAATTCTATTACTTTGAGTCTACCATCATGAGTGCAATT and oGR72:
CGTATCTCGAGCGTCTTGGCTTTTCGTTAAGCCTTTACTGGGG.

The PCR product was purified and digested with the
restriction enzymes EcoRI and XhoI and cloned in the
EcoRI/XhoI cloning sites of the expression vector
pET32A (Cat. NO. 69015-3; Novagen); the resulting
vectors were designated pGR3 and pGR7 respectively.
Expression in E.coli resulted in visual
observation of the fluorescent protein without
induction or UV treatment, indicating high expression
levels or a fluorescent protein with a high emission
amplitude.

b) Cloning of Polythoa 2 in eukaryotic expression
vector:

Polythoa 2 cDNA was amplified by PCR using
plasmid DNA pGR8 as template, and the primer
combinations oGR69/oGR70 or oGR69/oGR71:
oGR69: CTGGAATTCTCTACCGTCATGAGTGCAATTAAACCAGTCA
oGR70: CGTATCTCGAGATTGCGAGCCACGGCAACTTCATACAGC
oGR71: CGTATCTCGAGGCCATAATCTGAAGAGGAGAATTGCGAGCCAC
The PCR product was purified and digested with the
restriction enzymes EcoRI and XhoI and cloned in the
EcoRI/XhoI cloning sites of the expression vector
pCDNA3 (Invitrogen); the resulting vectors were
designated pGR4 and pGR5 respectively.
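
The cloning primers above carry the restriction sites used for the subsequent digestion near their 5' ends. As an illustrative sketch, the following Python code locates the EcoRI (GAATTC) and XhoI (CTCGAG) recognition sequences in primers oGR69 and oGR70 as listed above. The recognition sequences are standard; the helper name is hypothetical, and the sketch only searches the strand as written.

```python
# Recognition sequences of the two enzymes named in the text.
SITES = {"EcoRI": "GAATTC", "XhoI": "CTCGAG"}


def find_sites(seq: str) -> dict[str, list[int]]:
    """Return 0-based start positions of each recognition site in seq."""
    seq = seq.upper()
    return {
        name: [i for i in range(len(seq) - len(site) + 1)
               if seq.startswith(site, i)]
        for name, site in SITES.items()
    }


# Primer sequences copied from the text above.
oGR69 = "CTGGAATTCTCTACCGTCATGAGTGCAATTAAACCAGTCA"
oGR70 = "CGTATCTCGAGATTGCGAGCCACGGCAACTTCATACAGC"

print(find_sites(oGR69))  # forward primer: expected to carry the EcoRI site
print(find_sites(oGR70))  # reverse primer: expected to carry the XhoI site
```

Both recognition sequences are palindromic, so searching a single strand is sufficient for locating them.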

c) Cloning of Polythoa 2 in C. elegans expression
vector:

Polythoa 2 cDNA was amplified by PCR using
plasmid DNA pGR1 as template, and the primers
oGR74: CGTGGGCGGGGGACGAGGATGAGTGGAATTAAGCCAGTTATGAA
and oGR72:
CGTATCTCGAGCGTCTTGGCTTTTCGTTAAGCCTTTACTGGGG.
The PCR product was purified and digested with the
restriction enzymes EcoRI and XhoI and cloned in the
EcoRI/XhoI cloning sites of the expression vector
pDW2700; the resulting vector was designated pGR6.

25 

d) Cloning of Polythoa 1 GFP cDNA in eukaryotic
expression vector:

A 752 bp EcoRI/XhoI fragment of pGR26 was isolated,
purified and ligated into the EcoRI/XhoI cloning sites
of the expression vector pCDNA3 (Invitrogen). The
resulting vector was designated pGR24. Expression in
COS I cells resulted in visual observation of the
fluorescent protein after UV treatment.

e) Cloning of Polythoa 1 GFP cDNA in prokaryotic
expression vector:

A 752 bp EcoRI/XhoI fragment of pGR26 was isolated,
purified and ligated into the EcoRI/XhoI cloning sites
of the expression vector pET32A (Cat. NO. 69015-3;
Novagen). The resulting vector was designated pGR25.

3) Expression of new fluorescent proteins.

a) Expression of Polythoa 2 GFP in E. coli

Expression in E.coli was performed according to the
instructions of the pET32A provider (Novagen). Both
the plasmids pGR3 and pGR7 resulted in clear
expression in E. coli.

b) Expression of Polythoa 2 GFP in mammalian
cells

COS I, an African green monkey kidney cell line
standardly cultured in DMEM with Na-pyruvate
supplemented with 10% fetal calf serum (Life
Technologies) and antibiotics (Pen/Strep; Life
Technologies), was transfected with pGR4.
The cells were seeded at a concentration of 1.5 x 10^
cells/well in a 24-well plate and 7.5 x 10* cells/well
in a 1-well coverglass and transfected the day after
with Lipofectamine Plus reagent (GibcoBRL 10964-013),
according to the manufacturer's instructions.
The following day, the cells were washed twice with
PBS (Life Technologies), and complete medium (1 ml for
24-well, 3 ml for coverglass) was added. Fluorescence
of the cells after 24 hours was observed using the UV
light of the microscope with filter 450-490
(FT510; LP520). Both the plasmids pGR4 and pGR5
resulted in clear expression in COS I cells.



c) Expression of Polythoa 2 GFP in C. elegans. 

C. elegans wild-type strain was transformed with pGR6 
using microinjection techniques known in the art and 
described in Methods in Cell Biology, Vol. 48: C. 
elegans: Modern Biological Analysis of an Organism, 
ed. by Epstein and Shakes. pGR6 resulted in clear 
expression of GFP in C. elegans. 

4) Mutant fluorescent proteins 

To further improve the characteristics of the isolated 
mutant fluorescent proteins, mutagenesis experiments 
were performed. Improvements of the fluorescent 
proteins can be of different kinds, such as improved 
absorption spectra, improved emission spectra, 
enhancement of the chromophore, etc. 

Site-directed mutagenesis can be performed as 
described in Current Protocols in Molecular Biology, 
ed. by Ausubel et al., or as provided by the 
QuikChange Site-Directed Mutagenesis Kit (Stratagene, 
CA, USA) or by related methods as known in the art. 
Random mutagenesis, and more particularly molecular 
evolution techniques, can be performed as described by 
Kuchner and Arnold, 1997, TIBTECH 15:523-530; 
Stemmer, 1994, Nature 370:389-391; Stemmer, 1994, 
Proc. Natl. Acad. Sci. USA 91:10747-10751, or by 
related methods as known in the art. 

During the cloning of the full length cDNAs in the 
vectors using PCR technology, mutant fluorescent 
proteins were created. More particularly, the plasmids 
pGR3, pGR4, pGR5 and pGR8 contain a mutant Polythoa 2 
N41D GFP, while plasmid pGR7 expresses a Polythoa 2 
Q136R GFP mutant and pGR10 expresses an I106T 
mutant. The expression experiments described above 
clearly indicate that mutations introduced in the 
newly isolated fluorescent proteins conserve the 
basic fluorescence property of the protein. 
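
The mutant names used above follow the usual convention of wild-type residue, position counted from the initiator methionine, and replacement residue. The sketch below is illustrative only and is not part of the described experiments: the function name `apply_point_mutation` is hypothetical, and the input is the N-terminal part of the Polythoa 2 protein as translated from the cDNA listing of this document.

```python
import re

# First 110 residues of the Polythoa 2 protein, translated from the cDNA
# listing in this document (illustrative input, assumed to be error-free).
POLYTHOA2_N_TERM = (
    "MSAIKPVMKIELVMEGEVNGHKFTITGEGQGKPYEGTQTLNLTVTKGVPLPFAFDILSTA"
    "FQYGNRVFTKYPDDIPDYFKQTFPEGYSWERTFKYEEGVCTTKSDISLKK"
)


def apply_point_mutation(protein, mutation):
    """Apply a substitution written as <wild-type><position><mutant>, e.g. N41D."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError("not a point mutation: %r" % mutation)
    wild_type, mutant = match.group(1), match.group(3)
    index = int(match.group(2)) - 1  # mutation names count residues from 1
    if index < 0 or index >= len(protein):
        raise ValueError("position outside the sequence: %r" % mutation)
    if protein[index] != wild_type:
        raise ValueError(
            "expected %s at position %d, found %s"
            % (wild_type, index + 1, protein[index])
        )
    return protein[:index] + mutant + protein[index + 1:]


n41d = apply_point_mutation(POLYTHOA2_N_TERM, "N41D")
i106t = apply_point_mutation(POLYTHOA2_N_TERM, "I106T")
```

Checking the wild-type residue before substituting guards against numbering offsets between a sequence listing and a mutation name.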

Back-mutation towards the naturally occurring GFP 

The mutation Q136R in pGR7 was mutated back to 
the naturally occurring Polythoa 2 FP using the 
QuikChange Site-Directed Mutagenesis Kit and the 
primers 

oGR90: GACCCCAACGGCCCAATTATGCAGAAGAAGACCCTGAAATGGGAG 
and oGR91: 
CTCCCATTTCAGGGTCTTCTTCTGCATAATTGGGCCGTTGGGGTC. The 
resulting vector was designated pGR13. 
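
Mutagenesis primer pairs of this kind are normally exact reverse complements of one another. As a minimal sketch (the helper name `reverse_complement` is an assumption, not something defined in this specification), the two primer sequences quoted above can be checked against each other:

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


OGR90 = "GACCCCAACGGCCCAATTATGCAGAAGAAGACCCTGAAATGGGAG"
OGR91 = "CTCCCATTTCAGGGTCTTCTTCTGCATAATTGGGCCGTTGGGGTC"

# True when the two primers anneal to the same site on opposite strands.
primers_are_complementary = reverse_complement(OGR90) == OGR91
```

The same check is a quick way to spot transcription errors when primer sequences are copied between a description and a primer table.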

5) Construction of a Polythoa 2-Discosoma 1 hybrid 

a) Cloning of a Polythoa 2-Discosoma 1 hybrid GFP 
cDNA in prokaryotic expression vector: 

The 3' end of the Discosoma species was amplified 
in a primary PCR reaction with the specific primer 
combination oGR39/oGR20 as mentioned above (see 1)d). 
The resulting PCR products were analyzed on agarose 
gel and the appropriate DNA band of interest was 
isolated and cloned into the pCR-XL-TOPO vector (Cat. 
No. K4700-20; Invitrogen). The resulting vector was 
designated pGR17. Plasmid DNA of pGR17 was digested 
with the restriction enzymes EcoRV and StuI and 
analyzed on agarose gel. The appropriate band of 525 bp 
was isolated and cloned into the 3736 bp EcoRV 
fragment of pGR1. The resulting vector was designated 
pGR14. Expression in E. coli resulted in 
visual observation of the fluorescent protein after 
UV treatment. A 124 bp EcoRI-HindIII fragment of 
pCDNA3.1/HisA (Invitrogen) was isolated, purified and 
ligated into the 4212 bp EcoRI-HindIII fragment of 
pGR14. The resulting vector was designated pGR15. 
Expression in E. coli resulted in visual 
observation of the fluorescent protein after UV 
treatment. 

b) Cloning of a Polythoa 2-Discosoma 1 hybrid GFP 
cDNA in eukaryotic expression vector: 

Polythoa 2-Discosoma 1 hybrid cDNA was 
amplified by PCR using plasmid DNA pGR14 as a template 
and the primers oGR69: 
CTGGAATTCTCTACCGTCATGAGTGCAATTAAACCAGTCA and oGR96: 
CGTACCTCGAGCCTTTACTTGGTCAGCCGGCTCGGCAGCTTGG. The PCR 
product was purified and cloned in the cloning vector 
pCR-XL-TOPO. The resulting vector was designated 
pGR19. The 705 bp EcoRI/XhoI fragment of pGR19 was 
isolated, purified and cloned in the EcoRI/XhoI cloning 
sites of the expression vector pCDNA3 (Invitrogen). 
The resulting vector was designated pGR18. 
Expression in COS I cells resulted in visual 
observation of the fluorescent protein after UV 
treatment. 

c) Cloning of a Polythoa 2-Discosoma 1 hybrid GFP 
cDNA in C. elegans expression vector: 

Polythoa 2-Discosoma 1 hybrid cDNA was 
amplified by PCR using plasmid DNA pGR14 as template 
and the primer combination oGR75: 
CGTCGGCGCGCCATCATGAGTGCAATTAAACCAGTCATGAAGAT and 
oGR96: CGTACCTCGAGCCTTTACTTGGTCAGCCGGCTCGGCAGCTTGG. 
The PCR product was purified and cloned in the cloning 
vector pCR-XL-TOPO. The resulting vector was 
designated pGR21. The 700 bp AscI/XhoI fragment of 
pGR21 was isolated, purified and cloned in the 
AscI/XhoI cloning site of the expression vector 
pDW2700. The resulting vector was designated pGR20. 
Expression in C. elegans resulted in 
visual observation of the fluorescent protein after 
UV treatment. 
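
The cloning steps above rely on restriction sites carried in the 5' tails of the PCR primers. The following sketch, offered only as an illustration, locates the EcoRI (GAATTC), XhoI (CTCGAG) and AscI (GGCGCGCC) recognition sequences in the three primers quoted in this section; the names `SITES`, `PRIMERS` and `find_sites` are hypothetical.

```python
# Recognition sequences of the enzymes used for subcloning in this section.
SITES = {"EcoRI": "GAATTC", "XhoI": "CTCGAG", "AscI": "GGCGCGCC"}

# Primer sequences as quoted in the text of this section.
PRIMERS = {
    "oGR69": "CTGGAATTCTCTACCGTCATGAGTGCAATTAAACCAGTCA",
    "oGR75": "CGTCGGCGCGCCATCATGAGTGCAATTAAACCAGTCATGAAGAT",
    "oGR96": "CGTACCTCGAGCCTTTACTTGGTCAGCCGGCTCGGCAGCTTGG",
}


def find_sites(primer):
    """Return {enzyme: 0-based offset} for each recognition site in the primer."""
    hits = {}
    for enzyme, site in SITES.items():
        offset = primer.upper().find(site)
        if offset != -1:
            hits[enzyme] = offset
    return hits


sites_by_primer = {name: find_sites(seq) for name, seq in PRIMERS.items()}
```

With these inputs each primer carries exactly one of the three sites near its 5' end, which is consistent with the EcoRI/XhoI and AscI/XhoI fragments described above.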

6) Establishing the excitation and emission spectra 
of the new green fluorescent proteins 

Isolation of protein from Polythoa 2 GFP, Polythoa 2 
N41D GFP and Polythoa 2-Discosoma 2 fusion GFP. 

The fluorescent proteins were expressed in E. coli 
from vector pGR3 (N41D), pGR7 (Q136R), pGR13 (back- 
mutation, naturally occurring Polythoa 2 FP) and pGR15 
(Polythoa-Discosoma hybrid protein) and purified using 
the BugBuster Protein Extraction Reagent (Cat. No. 
70584-3; Novagen) and the His-Bind Buffer Kit (Cat. 
No. 69755-3; Novagen) according to the manufacturer's 
instructions. 

The excitation and emission spectra of the samples 
were then determined. All samples were excited at 
490 nm. The spectra were corrected for 
photomultiplier response and monochromator 
transmittance, transformed to wave number and 
integrated. All experiments were performed in an Aminco 
Bowman Series 2 Luminescence Spectrometer (SLM-Aminco 
Spectronic Instruments). 
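
The processing chain named above (instrument correction, transformation to wave number, integration) can be sketched as follows. This is an assumption-laden illustration, not the procedure of the instrument software: all function names are hypothetical, the correction is modelled as a point-by-point division, and the rescaling by lambda^2 when changing from a wavelength to a wave number axis is a standard convention that the text does not state explicitly.

```python
def to_wavenumber(wavelength_nm):
    """Convert a wavelength in nm to a wave number in cm^-1."""
    return 1.0e7 / wavelength_nm


def correct_spectrum(intensities, detector_response, monochromator_transmittance):
    """Divide out the two instrument factors point by point (assumed model)."""
    return [
        i / (d * m)
        for i, d, m in zip(intensities, detector_response, monochromator_transmittance)
    ]


def integrate_over_wavenumber(wavelengths_nm, intensities):
    """Trapezoidal integral of a spectrum on a wave number axis.

    Intensities recorded per unit wavelength are rescaled by lambda^2 / 1e7
    so that they are expressed per unit wave number before integrating.
    """
    points = sorted(
        (to_wavenumber(w), i * w * w / 1.0e7)
        for w, i in zip(wavelengths_nm, intensities)
    )
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += 0.5 * (y0 + y1) * (x1 - x0)
    return area


# Example: a flat test spectrum sampled every 1 nm between 400 and 500 nm.
wavelengths = [400.0 + k for k in range(101)]
flat_area = integrate_over_wavenumber(wavelengths, [1.0] * len(wavelengths))
```

Because the rescaling preserves area, the flat test spectrum integrates to approximately the same value on either axis, which is a convenient sanity check for the transformation.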

1) Synthetic Polythoa 2 fluorescent protein with 
optimal codon usage for C. elegans. 

To enhance the performance of the fluorescent proteins 
in organisms other than the Cnidaria species from 
which these fluorescent proteins were isolated, the 
codon usage was altered. Although the genetic code is 
considered to be universal, every organism has its 
preferred codon usage, which is related to the 
presence and the expression of tRNA genes, and hence 
is involved in post-transcriptional expression 
regulation. Such optimal codon usage has been 
determined for many organisms, including E. coli (Dong 
et al., 1996, J. Mol. Biol. 260:649-663), B. subtilis 
(Kanaya et al., 1999, Gene 238:143-155), Drosophila 
(Moriyama et al., 1997, J. Mol. Evol. 45:514-523), 
Saccharomyces (Percudani et al., 1997, J. Mol. 
Biol. 268:322-330) and C. elegans (Stenico et al., 1994, 
NAR 22:2437-2446). An overview of codon usage in these 
and other organisms can be found in Duret et al., 
1999, Proc. Natl. Acad. Sci. U.S.A. 96:4482-4487 and 
in Ikemura, 1985, Mol. Biol. Evol. 2:13-43. 
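
Altering codon usage changes the nucleotide sequence while leaving the encoded protein unchanged. As a hedged illustration, the sketch below translates the first ten codons of the native Polythoa 2 open reading frame and of the codon-adapted fragment, both taken from the sequence listings of this document, using the standard genetic code. The helper names `translate` and `changed_codons` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDDDEEEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA sequence in frame 0, ignoring a trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def changed_codons(first, second):
    """Count the codon positions at which two equally long sequences differ."""
    return sum(
        first[i:i + 3].upper() != second[i:i + 3].upper()
        for i in range(0, min(len(first), len(second)), 3)
    )


NATIVE = "ATGAGTGCAATTAAACCAGTCATGAAGATT"     # native Polythoa 2 cDNA
OPTIMISED = "ATGTCCGCTATCAAGCCAGTCATGAAGATC"  # codon-adapted fragment

same_protein = translate(NATIVE) == translate(OPTIMISED)
n_changed = changed_codons(NATIVE, OPTIMISED)
```

In this short stretch half of the codons are replaced by synonymous ones, while both sequences encode the same ten residues.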

The synthetic 922 bp gene was amplified using 
Herculase polymerase at Entelechon, Germany, and was 
delivered as a ligation product. This product was 
cloned into pCR-XL-TOPO (pGR16). The 888 bp FseI-NheI 
fragment of pGR16 was cloned into the FseI/NheI 
cloning sites of the expression vector pDW2721 and the 
resulting vector was designated pGR10. This plasmid 
was injected into C. elegans and clearly resulted in 
fluorescence. 

2) Synthetic introns in worm construct: 

In many organisms, such as C. elegans, the 
introduction of synthetic introns enhances 
expression levels (Fire et al., 1990, 
Gene 93:189-98, and references therein). 
An example is hereby included of a Polythoa 2 
fluorescent protein with optimal codon usage 
for C. elegans and with synthetic C. elegans introns. 
Such synthetic genes can be made easily by a person 
skilled in the art, or be ordered from companies such as 
Entelechon, Regensburg, Germany. 
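
A synthetic intron must be removable without changing the coding sequence. The sketch below is illustrative only: it takes the first intron of the intron-containing construct together with its flanking exon ends, as they appear in the sequence listings of this document, and confirms that removing the intron restores the uninterrupted codon-adapted sequence. The function name `remove_introns` is hypothetical, and the GT...AG boundary check reflects the common splice-site rule rather than anything stated in the text.

```python
EXON_1_END = "ccattcgctttc"
INTRON_1 = (
    "gtaagttt" "aaacatatat" "atactaacta" "accctgatta" "tttaaatttt" "cag"
)
EXON_2_START = "gacatcctctcc"


def remove_introns(gene, introns):
    """Remove each listed intron, which must occur exactly once in the gene."""
    for intron in introns:
        if not (intron.startswith("gt") and intron.endswith("ag")):
            raise ValueError("intron does not follow the GT...AG rule")
        if gene.count(intron) != 1:
            raise ValueError("intron must occur exactly once")
        gene = gene.replace(intron, "")
    return gene


with_intron = EXON_1_END + INTRON_1 + EXON_2_START
spliced = remove_introns(with_intron, [INTRON_1])
```

The spliced result reads ccattcgctttcgacatcctctcc, the same stretch found in the intron-free codon-adapted fragment.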

Fusion proteins 

GFP proteins have been used for many purposes in 
biological research. The main uses nevertheless have 
been determining the expression pattern of proteins in 
cells and multi-cellular organisms, and the subcellular 
localization or trafficking of proteins. To determine 
the expression pattern of a protein using GFPs, it 
suffices in principle to make a fusion between the 
promoter of the gene of interest and the GFP. Upon 
introducing a vector with this promoter-GFP fusion 
into the studied cell or organism, the expression 
induced by the promoter can easily be monitored by 
following the GFP expression. To monitor the 
subcellular localization of a protein, it suffices to 
make a fusion between the protein of interest and the 
GFP protein. This can be done at the N-terminal site 
or at the C-terminal site of the GFP protein, and even 
internal fusions are possible. Plasmids pGR3, pGR7 
and pGR13 are good examples of such fusion proteins, as 
they contain a 109 amino acid thioredoxin fragment in 
fusion with the Polythoa 2 GFP. This fusion protein 
shows clear fluorescence. 

TABLE 1 

Primer   5'-3' sequence 

oGR1     CACCACATGGAAGGAWRYKTNRAYGG 
oGR2     ACCACATGGAAGGATGCKTNRAYGGNCA 
oGR3     AATTTGTGATCAAGGGCRARGGNRWNGG 
oGR4     GTGATCAAAGGTGGACCNYTNCCNTT 
oGR5     GAGATATTGTCAACAGAGTTYMANTAYG 
oGR6     CATATTGTGAACAGAGTTYMANTAYGG 
oGR7     ATCCTGAGGACATACCAGAYTAYHWNAA 
oGR8     GACTATTTCAAGCAGTCGTKYCCNGMNGG 
oGR9     CATGGGAAAGGTCCTTGCAYTWYGARGA 
oGR10    GGTGACATCTCCTTTCARNAYNCG 
oGR11    CATATTCTCAGTGGANGSNTCCCA 
oGR12    CACAGGTCCATCGSNAGGRAARTT 
oGR13    CCATCGGCAGGAAARTTNANNCC 
oGR14    TGAATACCCTGTTTCCRTANTKRAA 
oGR16    AAGCAGTGGTATCAACGCAGAGT 
oGR18    AAGCAGTGGTAACAACGCAGAGT 
oGR20    GTAATACGACTCACTATAGGGCCGCAGTCGACCGTTTTTTTTTTTTT 
oGR21    A A Af^r^P(^T(^ n nnCTTH nTTTnGnTTTGR A 
oGR22    TGTCAACAGCATTCCAGTATGGCAACAGGGTA 
oGR23    TGAAGAGGGCGTTTGCACCACAAAGAGTG 
oGR24    AAAGGGRAGAAGCTTGACCCCAACGGCC 
oGR25    TTGAAAGCAGTCTGGTTGGCCTTTCTTGA 
oGR26    TGTGGTGCAAACGCCCTCTTCATATTTGAA 
oGR27    CCCTGTTGCCATACTGGAATGCTGTTGAC 
oGR28    AAGGAAGGGGCACGCCTTTAGTGACTGTAAG 
oGR29    CTTGCCTTGTCCCTCTCCCGTGATCGTGA 
oGR30    GTAATACGACTCACTATAGGGCAAGCAGTGGTATCAACGCAGAGT 
oGR31    GTAATACGACTCAGTATAGGGG 
oGR32    ACCTTGGTGATTTGGGAGAAGGCAGATCGAGAG 
oGR33    CTTGGTGATTTGGGAGAAGGCAGATCGAG 
oGR34    CGTCTTGGCTTTTCGTTAAGCCTTTACTGGGG 
oGR35    GAGAAACTTCTNTTCACTTTGTTGTCGTCTTG 
oGR36    GACACTGGTGATTTGGGAGAAGGCAGATC 
oGR37    ATTGCGAGCCACGGGAACTTCATACAGC 
oGR38    GCGATAATCTGAAGAGGAGAATTGCGAGCGAG 
oGR39    GGAGAAGGAGAAGGAAAACCATACGAGGG 
oGR40    CCAGTACGGCAACAGGGCATTCACCAAAT 
oGR41    GGGAAAGAACCATGAATTTTGAAGACGGG 
oGR42    CCCGCGATTGGCCCAGTTATGCAGAAGAA 
oGR43    GCCAATGGGGGGAAAGTTCGCACCATCAA 
oGR44    CGCCCCCGTCTTCAAAATTCATGGTTCTT 
oGR45    CGTGTTGCCGTACTGGAACGCTGTTGTCA 
oGR46    TGGGAAGTCTTATGATGGCACCAATACCG 
oGR47    TTCAGGTAACCAAGGGTGGACCTCTGCCA 
oGR48    TGTCAGGCATCCCGAAGACATCGCTGATT 
oGR49    CATGCACTTTGAAGACGGTGGCGTGTGTT 
oGR50    TCATTGGTGATACAACACACGCCACCGTC 
oGR51    CATGACCCTTTCCCATGTAAATCCTTCGGGA 
oGR52    TTGTGGTGACAAAATAGGCCAAGCAAATGGC 
oGR53    GAAATAAAAGGCGAGGGTCACGGGAAGCC 
oGR54    CATGGTAACGAAGGGTGGACCGCTGCCAT 
oGR55    AAANGTGTCGTTTCCCGAGGGATTTAGAT 
oGR56    TGGCGTGATTTGCAGCNCCAATGATATCA 
oGR57    CGCCACCGNGTTCAAAGTGCATGACCCTT 
oGR58    ANCGGCTATGTCTTCAGGGTGCTTGACAA 
oGR59    GGTGCACCCTTGGTTACCATGAGCTTGACGTT 
oGR68    CTGGAATTCTATTAGTTTGAGTCTACCATCATGAGTGCAATT 
oGR69    CTGGAATTCTCTACCGTCATGAGTGCAATTAAACCAGTCA 
oGR70    CGTATCTCGAGATTGCGAGCCACGGCAACTTCATACAGC 
oGR71    OV3 1 A 1 0 1 OoAooUoA 1 MA i L» 1 oAAoAl3oAwV\ 1 1 oLroAvsLrUMo 
oGR72    GGTATCTCGAGCGTCTTGGCTTTTCGTTAAGCCTTTACTGGGG 
oGR74    CGTGGGCGGGCCACCACCATGAGTGCAATTAAGCCAGTTATGAA 
oGR75    CGTCGGCGCGCCATCATGAGTGCAATTAAACCAGTCATGAAGAT 
oGR90    GACCCCAACGGCCCAATTATGCAGAAGAAGACCCTGAAATGGGAG 
oGR91    CTCCCATTTCAGGGTCTTCTTCTGCATAATTGGGCCGTTGGGGTC 
oGR96    CGTACCTCGAGCCTTTACTTGGTCAGCCGGCTCGGCAGCTTGG 
oGR97    GGTAGCTGGAGGATGGATCCTTTACTTGGTCAGCCG 
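
Primers oGR1 to oGR14 of Table 1 are degenerate: they contain IUPAC ambiguity codes such as R (A or G), Y (C or T) and N (any base). As an illustrative sketch with hypothetical helper names, the code below computes how many distinct sequences a degenerate primer represents and tests whether it can anneal to a given target, using oGR1 and the first 26 nucleotides of the Discosoma 1 cDNA fragment from the sequence listings.

```python
# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct unambiguous sequences covered by the primer."""
    count = 1
    for code in primer.upper():
        count *= len(IUPAC[code])
    return count


def matches(primer, target):
    """True if every target base is allowed by the primer code at that position."""
    if len(primer) != len(target):
        return False
    return all(
        base in IUPAC[code] for code, base in zip(primer.upper(), target.upper())
    )


OGR1 = "CACCACATGGAAGGAWRYKTNRAYGG"
DISCOSOMA1_START = "caccacatggaaggaagtgtggacgg"

ogr1_degeneracy = degeneracy(OGR1)
ogr1_fits = matches(OGR1, DISCOSOMA1_START)
```

Here oGR1 stands for 256 individual sequences, one of which is identical to the start of the Discosoma 1 fragment.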
TABLE 2 Primer combinations 



OvJlvl   OvJIviu 
nHP t 1  OUIvi 1 
e\m> 1   OOKl 
OuKl 
OUKl 
OxJIvJ 1 
oGR11    oGR30    oGR31 
oGR12    oGR30    oGR31 
oGR13    oGR30    oGR31 
oGR14    oGR30    oGR31 
OuKlO 
oGRlO    oGR20 
oGR16    oGR27 
oGR16    oGR28 
oGR16    oGR29 
oGR2     oGR10 
oGR2     oGR11 
oGR2     oGR12 
oGR2     oGR13 
oGR2     oGR14 
oGR2     oGR20 
oGR21    oGR20 
oGR22    oGR20 
oGR23    oGR20 
oGR24    oGR20 
oGR25    oGR16 
oGR25    oGR18 
oGR25    oGR30    oGR31 
oGR26    oGR16 
oGR26    oGR18 
oGR26    oGR30    oGR31 
oGR27    oGR16 
oGR27    oGR18 
oGR27    oGR30    oGR31 
oGR28    oGR16 
oGR28    oGR18 
oGR28    oGR30    oGR31 
oGR29    oGR16 
oGR29    oGR18 
oGR29    oGR30    oGR31 
oGR3     oGR10 
oGR3     oGR11 
oGR3     oGR12 
oGR3     oGR13 
oGR3     oGR14 
oGR3     oGR20 
oGR32    oGR33 
oGR32    oGR34 
oGR32    oGR35 
oGR33    oGR34 
oGR33    oGR35 
oGR34    oGR35 
oGR36    oGR37 
oGR36    oGR38 
oGR39    oGR20 
oGR4     oGR10 
oGR4     oGR11 
oGR4     oGR12 
oGR4     oGR13 
oGR4     oGR14 
oGR40    oGR20 
oGR41    oGR20 
oGR42    oGR20 
oGR43    oGR16 
oGR43    oGR18 
oGR43    oGR30    oGR31 
oGR44    oGR16 
oGR44    oGR18 
oGR44    oGR30    oGR31 
oGR44    oGR31 
oGR45    oGR16 
oGR45    oGR18 
oGR45    oGR30    oGR31 
oGR45    oGR31 
oGR46 
oGR47    oGR20 
oGR48    oGR20 
oGR49    oGR20 
oGR5     oGR10 
oGR5     oGR11 
oGR5     oGR12 
oGR5     oGR13 
oGR5     oGR14 
oGR50    oGR16 
oGR50    oGR18 
oGR50    oGR30    oGR31 
oGR51    oGR16 
oGR51    oGR18 
oGR51    oGR30    oGR31 
oGR51    oGR31 
oGR52    oGR16 
oGR52    oGR18 
oGR52    oGR30    oGR31 
oGR52    oGR31 
oGR53    oGR20 
oGR54    oGR20 
oGR55    oGR20 
oGR56    oGR20 
oGR57    oGR16 
oGR57    oGR18 
oGR57    oGR30    oGR31 
oGR58    oGR16 
oGR58    oGR18 
oGR58    oGR30    oGR31 
oGR59    oGR16 
oGR59    oGR18 
oGR59    oGR30    oGR31 
oGR6     oGR10 
oGR6     oGR11 
oGR6     oGR12 
oGR6     oGR13 
OOKOo    OOK/Z 
oGR69    oGR70 
oOKoy    OUK/ 1 
ouKoy 
oOKoy 
OOKIU 
OOK/     OUKi 1 
OCjK/ 
oCjK/ 
OCjK/Z   OLjK/4 
OVJIx/ J 
oGR75    oGR97 
oGR8     oGR10 
oGR8     oGR11 
oGR8     oGR12 
oGR8     oGR13 
oGR9     oGR10 
oGR9     oGR11 
oGR9     oGR12 
oGR9     oGR13 
CLAIMS: 



1. An isolated nucleic acid molecule encoding a 
fluorescent protein comprising an amino acid sequence 
illustrated in any of the polypeptide sequences of 
Figures 3(a) to 3(d) or functional equivalents, 
fragments or variants thereof. 

2. An isolated nucleic acid molecule encoding a 
protein capable of emitting fluorescence upon 
irradiation by incident light, wherein said maximal 
absorbance of said incident light is in the range 440- 
480 nm, and maximal fluorescence emission is in the 
range 470-510 nm. 

3. An isolated nucleic acid molecule according 
to claim 2, wherein said molecule encodes a protein 
having an amino acid sequence as depicted in any of 
the polypeptide sequences of Figures 3(a) to 3(d). 

4. An isolated nucleic acid molecule according 
to claim 1 wherein said fluorescent protein comprises 
an amino acid sequence having combined polypeptide 
sequences from at least 2 of the polypeptide sequences 
depicted in Figures 3(a) to 3(d). 

5. An isolated nucleic acid molecule according 
to claim 4 wherein said protein comprises a Polythoa 
2-Discosoma 1 hybrid having the sequence illustrated 
in Figure 1. 

6. An isolated nucleic acid molecule encoding a 
fusion protein comprising an amino acid sequence 
depicted in any of Figures 3(a) to 3(d) together with 
a nucleotide sequence encoding a protein of interest. 

7. An isolated nucleic acid molecule according 
to claim 6 wherein said fusion protein comprises the 
amino acid sequences depicted in Figures 4 and 5. 

8. An isolated nucleic acid molecule according 
to claim 5 wherein said protein of interest is an 
antibody. 

9. An isolated nucleic acid molecule according 
to any of claims 1 to 8, which is a DNA molecule. 

10. An isolated nucleic acid molecule according 
to claim 9, wherein said DNA molecule is cDNA. 

11. An isolated nucleic acid molecule according 
to any of claims 1 to 10, which is derived from an 
Anthozoa species. 

12. An isolated nucleic acid molecule according 
to claim 11, wherein said Anthozoa species is any of a 
Polythoa or Discosoma species. 

13. An isolated nucleic acid molecule according 
to any preceding claim, wherein said molecule 
comprises a nucleotide sequence which has at least 70, 
preferably at least 80, more preferably at least 90 
and even more preferably at least 95% sequence 
identity to the nucleic acid sequences depicted in 
Figure 1. 

14. An isolated nucleic acid molecule according 
to any preceding claim, wherein said nucleic acid 
molecule comprises any of the nucleic acid sequences 
depicted in Figure 1. 

15. An isolated nucleic acid molecule according 
to claim 13 comprising any of the nucleotide sequences 
depicted in Figure 2(a) or 2(b). 

16. An antisense molecule capable of 
hybridising to a nucleic acid molecule according to 
any of claims 1 to 13, under conditions of high 
stringency. 

17. An isolated fluorescent protein or 
functional equivalent, derivative or variant thereof 
encoded by a nucleic acid molecule according to any of 
claims 1 to 13. 

18. An isolated fluorescent protein capable of 
emitting fluorescence upon irradiation by incident 
light wherein the maximal absorbance of said incident 
light is in the range 440-480 nm, and maximal 
fluorescence emission is in the range 470-510 nm. 

19. An isolated fluorescent protein comprising 
an amino acid sequence which has at least 70, 
preferably at least 80, more preferably at least 90 
and even more preferably at least 95% sequence 
identity to the amino acid sequence depicted in 
Figures 3 to 8. 

20. An isolated fluorescent protein comprising 
an amino acid sequence corresponding substantially to the 
polypeptide sequences depicted in any of Figures 3 to 
8. 

21. An isolated fusion fluorescent protein 
comprising a fluorescent protein according to any of 
claims 16 to 20 together with the amino acid sequence 
of a protein or polypeptide of interest. 

22. A fluorescently labelled antibody or a 
paratope thereof coupled to a fluorescent protein 
according to any of claims 16 to 20. 

23. An expression vector comprising any of the 
nucleic acid molecules according to claims 1 to 15. 

24. An expression vector comprising any of the 
plasmid sequences depicted in Figures 9 to 14. 

25. An expression vector comprising the 
sequences of any of plasmids pGR8 to pGR20. 

26. A host cell transformed or transfected with 
an expression vector according to any of claims 23 to 



27. A prokaryotic cell transformed or 
transfected with any of expression vectors pGR3, pGR7 
depicted in Figures 9 and 13 or pGR13. 

28. A prokaryotic cell according to claim 25 
which is E.coli. 



29. A eukaryotic cell transformed or 
transfected with an expression vector corresponding 
substantially to the plasmids designated pGR4 or pGR5 
in Figures 10 or 11. 

30. A transgenic cell, tissue or non-human 
organism comprising a transgene capable of expressing 
a fluorescent protein according to any of claims 17 to 
21 or an antibody according to claim 22. 

31. A transgenic cell, tissue or non-human 
organism according to claim 30, wherein said transgene 
is included in an expression vector. 

32. A transgenic cell, tissue or non-human 
organism according to claim 31, wherein said vector is 
one according to claim 23. 

33. A transgenic cell, tissue or non-human 
organism wherein said non-human organism is C. elegans 
and said transgene substantially corresponds to a 
nucleotide sequence as depicted in Figure 12. 

34. A fluorescent protein, or a functional 
equivalent, derivative or bioprecursor thereof, 
expressed by a cell, tissue or organism according to 
any of claims 27 to 33. 

35. A process for producing the protein of any 
one of claims 17 to 21, comprising the steps of 
cultivating a cell, tissue or organism according to any 
of claims 24 to 33 under conditions suitable for 
expression of the protein and optionally recovering 
the expressed protein. 

36. An oligonucleotide probe comprising at 
least about 10 nucleotides of a nucleotide sequence 
that is capable of selectively hybridising to a 
nucleic acid molecule according to any of claims 1 to 
15. 

37. A method for selecting cells capable of 
expressing a protein of interest, comprising 
introducing into said cells a vector comprising the 
nucleotide sequence of a fluorescent protein according 
to any of claims 17 to 22 operatively linked to a 
promoter or regulatory region of the protein of 
interest, cultivating the cell under conditions 
necessary for expressing the protein of interest and 
monitoring for any fluorescence following expression 
of said fluorescent protein. 

38. A method for producing fluorescence 
resonance energy transfer comprising: 
providing a donor molecule comprising a 
fluorescent protein according to any of claims 17 to 
21; 
providing an appropriate acceptor molecule for 
the fluorescent protein; and 
bringing the donor molecule and acceptor molecule 
into sufficiently close contact to allow fluorescence 
resonance energy transfer. 

39. A method for producing fluorescence resonance 
energy transfer comprising: 
providing an acceptor molecule comprising a 
fluorescent protein according to any of claims 17 to 
21; 
providing an appropriate donor molecule for the 
fluorescent protein; and 
bringing the donor molecule and acceptor molecule 
into sufficiently close contact to allow fluorescence 
resonance energy transfer. 

40. A microscopic nematode comprising a transgene 
capable of expressing a fluorescent protein according 
to any of claims 17 to 20. 

41. A nematode according to claim 40 which is 
C. elegans. 

42. A fluorescent protein obtainable from the coral 
species Anthozoa. 

43. A fluorescent protein according to claim 41 which 
is obtainable from Discosoma or Polythoa. 

44. A fluorescent protein according to claim 42 or 43 
which is capable of emitting fluorescence upon 
irradiation by incident light wherein the maximal 
absorbance of said incident light is in the range 440- 
480 nm, and maximal fluorescence emission is in the 
range 470-510 nm. 

45. A fluorescent protein according to claim 42 or 43 
comprising an amino acid sequence which has at least 
70, preferably at least 80, more preferably at least 
90 and even more preferably at least 95% sequence 
identity to the amino acid sequence depicted in 
Figures 3 to 8. 

cDNA fragment of Polythoa 1 encoding a fluorescent protein 
Start codon ATG and stop codon TAA are indicated (upper case). 

1 acgcggggat tcaccttggt gatttgggag aaggcagatc gagagcaaga gtcagtgtaa 
61 taacttactt tgagtctacc atcATGagtg caattaagcc agttatgaag gtagaattgg 
121 tcatggaagg aaatgtgaac gggcacaagt tcacgattac aggagaggga caaggcaagc 
181 cttacgaggg aactcacact ctaaacctta cagtcacaaa aggcgggccc cttcctttcg 
241 cttacgatat cttgtcarca gcattccagt acggcaacag ggtatttacc aaatacccag 
301 aagatatacc ggactatttc aagcagacct ttccagaagg atattcgtgg gaaagaactt 
361 tcaaatatga cgagggcctt tgcaccacaa aaagtgacat atgcctcaag aaaggcgaac 
421 cggactgctt tcaatacaaa atttactttg aagggaagaa ccttggcccc agcggtccaa 
481 ttatgcagaa gaagaccctg aaatgggagc catccactga gaggatgtac atggacgtgg 
541 ataaagacgg tgcaaaggtg ctgaagggcg atgataatgc ggccctgttg cttgaaggag 
601 gtggccatta tcgttgtgac ttcaatagta tttacaaggc gaagaaaact gggtccttgc 
661 cagcatatca ctggatagac caccgcattg agattttgag ccacgataaa gattacaaca 
721 aggttacaat gcatgaattt gccgctgctc gtaattctcc ttttccgata atggcgcccc 
781 agTAAaggct taacgaaaag ccaagacgac aacaaagtga aaaagaagtt tctcgtttac 
841 ttttttctga aggcatttat cactaattag cttttgatag ttttgattca cggattcgat 
901 ccatgaattt cttagggact agctctagaa taaatgattg tgaaacaaaa actagttttc 
961 atattttgcg agatttttca cttcataaag acagactttt taaactcagt tgtagccaaa 
1021 tacaaataag gaaagtgtat taagaattaa acaaacttgt tgtggaaaaa taateiaaeuic 
1081 ggtcgactgc ggccctataa tgagtcgtat tac 


cDNA fragment of Polythoa 2 encoding a fluorescent protein 
Start codon ATG and stop codon TAA are indicated (upper case). 

1 acgcggggac actggtgatt tgggagaagg cagatcgaga gcaagagtca gtgtaataac 
61 ttactttgag tctaccgtcA TGagtgcaat taaaccagtc atgaagattg aattggtcat 
121 ggaaggagag gtgaacgggc acaagttcac gatcacggga gagggacaag gcaagcctta 
181 cgagggaaca cagactctaa accttacagt cactaaaggc gtgccccttc ctttcgcttt 
241 cgatatcttg tcaacagcat tccagtatgg caacagggta tttaccaaat acccagatga 
301 tataccggac tatttcaagc agacctttcc ggaaggatat tcgtgggaaa gaactttcaa 
361 atatgaagag ggcgtttgca ccacaaagag tgacataagc ctcaagaaag gccaaccaga 
421 ctgctttcaa tataaaatta actttaaagg ggagaagctt gaccccaacg gcccaattat 
481 gcagaagaag accctgaaat gggagccatc cactgagagg atgtacatgg acgtggataa 
541 agacggtgca aaggtgctga agggcgatgt taatgcggcc ctgttgcttg aaggaggtgg 
601 ccattatcgt tgtgacttta acagtactta caaggcgaag aaaactgtgt ccttcccagc 
661 atatcacttt gtggaccacc gcattgagat tttgagccac aatacggatt acagcaaggt 
721 tacactgtat gaagttgccg tggctcgcaa ttctcctctt cagattatgg cgccccagTA 
781 Aaggcttaac gaaacgccaa tacgacaaca aagtgaaaaa caagtttttc gttatttttt 
841 tctgaaagca tttatcacta attagctttt gatagttttg attcacggat tcgatccgga 
901 atttaatagg gactagctct agtctagaat aaacgattgt gtaacaaaaa ctagctttca 
961 taatttgcgg gatttttcac ttcataaaga cagacttttt aaactcagtt gtagccaaat 
1021 acaaataagg aaagcgtatt aagaattaaa caaacttgtt gtcgaaaaaa aaaaaaacgg 
1081 tcgattgcgg ccctatagtg agtcgtatta c 



cDNA fragment of Discosoma 1 encoding a fluorescent protein 
Stop codon TAA is indicated (upper case). 

1 caccacatgg aaggaagtgt ggacgggcaa aatttcgtga tcactggaga aggagaagga 
61 aaaccatacg agggaacaca tgttatagac ctgcaagtcg ttgaaggcgg acctctgcgt 
121 ttcgcttacg atatcttgac aacagcgttc cagtacggca acagggcatt caccaaatac 
181 ccatcagata ttcctgacta tttcaagcag acttttcctc aagggtatac atgggaaaga 
241 accatgcact ttgaagacgg tggcgtgtgt accgtcaata gcgacgtaag cctgaaaagc 
301 ggctgttttg agtataaaat tcgttttgat ggtgagaact ttccccccaa tggcccagtt 
361 atgcagaaga agactgtgaa atgggagcca tccactgaga acatgtatga gcatgatggg 
421 atgctgaagg gtgatgttag cagaactctg ttgcttgaag gaggtggcca ttaccaatgc 
481 gactttaaaa ctatttacaa agcgaaggac agccagggaa tcaagatgcc agaatatcac 
541 tttgtggacc accgcattga gattttgagc catgacaaag attacaagat ggtcaaggtg 
601 tatgagattg ccgaagctca ctattccaag ctgccgagcc ggctgaccaa gTAAaggcct 
661 aaggaaaagc caacaagcca acaaggagga aaaaatacta gtgtttctag tacagttttt 
721 taagccattt actaggaatt agtttttaat acttcagatc gtttcgggat ttgttagaga 
781 ttagcttacg aaaactgata ctcctagagt ttctagtatt gtttttaagc catttactcg 
841 gaattagttt ttgatacttt agatcgtttc ggaatttgtt agagtttagc tttaaaaaaa 
901 tactagactg 




cDNA fragment of Discosoma 2 encoding a fluorescent protein 

1 caccacatgg aaggaagtgt tgacggccac tactttgaaa ttaaaggcaa tggatatggg 
61 aagtcttatg atggcaccaa taccgtaaag cttcaggtaa ccaagggtgg acctctgcca 
121 tttgcttggc ctattttgtc accacaattt caatatggaa acaagatatt tgtcaggcat 
181 cccgaagaca tcgctgatta taaaaagctg tcatttcccg aaggatttac atgggaaagg 
241 gtcatgcact ttgaagacgg tggcgtgtgt tgtatcacca atgatatcag tttggaaggc 
301 aactgtttca tctaccacat caatttcatt ggcttgaact ttccttccga tggacctgtg 

DNA fragment encoding Polythoa 2 with optimal codon usage for C. elegans as in 
plasmid pGR10. 

1 atgtccgcta tcaagccagt catgaagatc gagctcgtca tggagggaga ggtcaacgga 
61 cacaagttca ccatcaccgg agagggacag ggaaagccat acgagggaac ccagaccctc 
121 aacctcaccg tcaccaaggg agtcccactc ccattcgctt tcgacatcct ctccaccgct 
181 ttccagtacg gaaaccgtgt cttcaccaag tacccagacg acatcccaga ctacttcaag 
241 cagaccttcc cagagggata ctcctgggag cgtaccttca agtacgagga gggagtctgc 
301 accaccaagt ccgacatctc cctcaagaag ggacagccag actgcttcca gtacaagatc 
361 aacttcaagg gagagaagct cgacccaaac ggaccaatca tgcagaagaa gaccctcaag 
421 tgggagccat ccaccgagcg tatgtacatg gacgtcgaca aggacggagc taaggtcctc 
481 aagggagacg tcaacgctgc tctcctcctc gagggaggag gacactaccg ttgcgacttc 
541 aactccacct acaaggctaa gaagaccgtc tccttcccag cttaccactt cgtcgaccac 
601 cgtatcgaga tcctctccca caacaccgac tactccaagg tcaccctcta cgaggtcgct 
661 gtcgctcgta actccccact ccagatcatg gctccacag 

DNA fragment encoding polythoa 2 with optimal codon usage for C .elegans further 
including introns. the introns are underlined. Furthermore the startTng codon 
ATG is preceded by a 5' UTR containing an Kozak site. 

I'tggctagcgt cgacggtacc ggtagaaaaa atgtccgcta tcaagccagt catgaagatc 
61 gagctcgtca tggagggaga ggtcaacgga cacaagttca ccatcaccgg agagggacag 
121 ggaaagccat acgagggaac ccagaccctc aacctcaccg tcaccaaggg agtcccactc 
181 ccattcgctt tc gtaagttt aaacatatat atactaacta accctgatta tttaaatttt 
241 cagg acatcc tctccaccgc tttccagtac ggaaaccgtg tcttcaccaa gtacccagac 
301 gacatcccag actacttcaa gcagaccttc ccagagggat actcctggga gcgtaccttc 
361 aagtacgagg agggagtctg caccaccai ag taagtttaaa cagttcggta ctaactaacc 
421 atacatattt aaattttcag gtccgacatc tccctcaaga agggacagcc agactgcttc 
481 cagtacaaga tcaacttcaa gggagagaag ctcgacccaa acggaccaat catgcagaag 
541 aagaccctca agtgggagcc atccaccgag cgtatgtaca tggacgtcga caaggacgga 
601 gctaaggtcc tcaa ggtaag tttaaacttg gacttactaa ctaacggatt atatttaaat
661 tttcag ggag acgtcaacgc tgctctcctc ctcgagggag gaggacacta ccgttgcgac
721 ttcaactcca cctacaaggc taagaagacc gtctccttcc cagcttacca cttcgtcgac 
781 caccgtatcg agatcctctc ccacaacacc gactactcca aggtcaccct ctacgaggtc 
841 gctgtcgctc gtaactcccc actccagatc atggctccac agtagggccg gccgagctcc 
901 gcatcggccg ctgtc 
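
The relation between the intron-containing fragment and the intron-free fragment can be checked with a short sketch. It is an illustration only and rests on an assumption read off the listing above, namely that each artificial intron begins with `gtaagtttaaac` and ends with `tttaaattttcag`. The names `INTRON`, `splice`, `start_codon_index` and `EXAMPLE` are hypothetical.

```python
import re

# Assumption: every artificial intron in the listing starts with "gtaagtttaaac"
# and ends with "tttaaattttcag"; the non-greedy match stops at the first such end.
INTRON = re.compile(r"gtaagtttaaac[acgt]+?tttaaattttcag")


def splice(listing):
    """Remove numbering and spacing, then delete every stretch matching INTRON."""
    seq = "".join(ch for ch in listing.lower() if ch in "acgt")
    return INTRON.sub("", seq)


def start_codon_index(listing):
    """Return the 0-based index of the first ATG after splicing, or -1 if absent."""
    return splice(listing).find("atg")


# Lines 181 and 241 of the intron-containing listing, copied from above.
EXAMPLE = (
    "181 ccattcgctt tc gtaagttt aaacatatat atactaacta accctgatta tttaaatttt\n"
    "241 cagg acatcc tctccaccgc tttccagtac ggaaaccgtg tcttcaccaa gtacccagac"
)
print(splice(EXAMPLE))
```

With the first intron removed, the printed sequence reads straight across the exon junction (`...gctttcgacatc...`), as in the intron-free fragment. Applying `start_codon_index` to the first line of the listing locates the ATG that follows the 5' UTR.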
Protein sequence of polythoa 1

1 MSAIKPVMKV ELVMEGNVNG HKFTITGEGQ GKPYEGTHTL NLTVTKGGPL PFAYDILSAA
61 FQYGNRVFTK YPEDIPDYFK QTFPEGYSWE RTFKYDEGLC TTKSDICLKK GEPDCFQYKI
121 YFEGKNLGPS GPIMQKKTLK WEPSTERMYM DVDKDGAKVL KGDDNAALLL EGGGHYRCDF
181 NSIYKAKKTG SLPAYHWIDH RIEILSHDKD YNKVTMHEFA AARNSPFPIM APQ*

Protein sequence of polythoa 2 

1 MSAIKPVMKI ELVMEGEVNG HKFTITGEGQ GKPYEGTQTL NLTVTKGVPL PFAFDILSTA
61 FQYGNRVFTK YPDDIPDYFK QTFPEGYSWE RTFKYEEGVC TTKSDISLKK GQPDCFQYKI
121 NFKGEKLDPN GPIMQKKTLK WEPSTERMYM DVDKDGAKVL KGDVNAALLL EGGGHYRCDF
181 NSTYKAKKTV SFPAYHFVDH RIEILSHNTD YSKVTLYEVA VARNSPLQIM APQ*



Protein sequence of the N-terminal part of discosoma 1

1 HHMEGSVDGQ NFVITGEGEG KPYBGTHVID LQWEGGPIiR PAYDILTEAP QYGNRAFTKY 
61 PSDIPDYFKQ TFPQGYTWER TMHFEDGGVC TVNSDVSLKS GCFEYKIRFD GENFPPNGPV 
121 MQKKTVKWEP STKNMYEHDG MLKGDVSRTL LLEGGGHYQC DFKTIYKAKD SQGIKMPEYH
181 FVDHRIEILS HDKDYKMVKV YEIAEAHYSK LPSRLTK* 

Protein sequence of an internal part of discosoma 2

1 HHMEGSVDGH YFBIKGNGYG KSYD3INTVK LQVTKGGPLP FAWPILSPQF QYGNKXPVRH 
61 PEDIADYKKL SFPBGPTWER VMHPEDG6VC CITNDISIiEG NCPIYHINPI GLITPPSDGPV 



Polythoa 2 fluorescent fusion protein in pGR3

MSDKIIHLTDDSFDTDVLKADGAILVDFWAEWCGPCKMIAPILDEIADEYQGKLTVAKLNIDQNPGTAPK
YGIRGIPTLLLFKNGEVAATKVGALSKGQLKEFLDANLAGSGSGHMHHHHHHSSGLVPRGSGMKETAAAK
FERQHMDSPDLGTDDDDKAMADIGSEFSTVMSAIKPVMKIELVMEGEVNGHKFTITGEGQGKPYEGTQTL
DLTVTKGVPLPFAFDILSTAFQYGNRVFTKYPDDIPDYFKQTFPEGYSWERTFKYEEGVCTTKSDISLKK
GQPDCFQYKINFKGEKLDPNGPIMQKKTLKWEPSTERMYMDVDKDGAKVLKGDVNAALLLEGGGHYRCDF
NSTYKAKKTVSFPAYHFVDHRIEILSHNTDYSKVTLYEVAVARNLEHHHHHH*




Polythoa 2 fluorescent fusion protein in pGR7

pGR10
pGR13
pGR16
pGR3
pGR4
pGR5
pGR6
pGR7
pGR8

POLYTHOA2
consensus

pGR1
pGR10
pGR13
pGR16
pGR3
pGR4
pGR5
pGR6
pGR7
pGR8

POLYTHOA2
consensus



MSDKIIHLTDDSFDTDVLKADGAILVDFWAEWCGPCKWrCAPIIiDEIMEYQGKLT^ 
HSDKI IHLTDDSFDTDVLKADGAILVDFWAEWCGPCKMIAPIIJ>6I ADEYQGKLTV^^ 

HSDKIIHLTDDSFDTDVIiKAIX3AILVDFWAEWCGPCKMIAPI3jDEXADEyQGKLl^^ 



IIX2NPGTAPKyGIR6IPT£iLLFKNGEVJU^TKVGALSKGQLKEFIjD 
ZDQNPGTAPKyGXRGIPTLIJiFKHGE VAATSCVGALS KGQliK£FLDA29LAGSGSGHKHHHH 

IDQNPGTAPKyGIRGXPTLjUiFXNGBVAATKVQALSK6QIiKEPU3iUIIiAOSGSGHKH^ 



pGR1
pGR10
pGR13
pGR16
pGR3
pGR4
pGR5
pGR6
pGR7
pGR8
POLYTHOA2
consensus

pGR1
pGR10
pGR13
pGR16
pGR3
pGR4
pGR5
pGR6
pGR7
pGR8



HHSSGLVPRGSGMKBTAAARFERQHMDSPDIiGTDDDOKAMADIGSEF^nrFESTI 
HHSSGIlVPBGSGMECBTAAAKFERQH^1DSFDIlGTDDDDKAMADZGSEF TV 



HHSSGLVPRGSGMKETAAAKFERQHMDSPDLGTDDDDKAMADIGSEFinrFBST 



MSAIKP 
MSAiKP 
MSAIKF 
MSAIKP 
iLMSAIKP 
HSAIKF 
MSAIKP 
MSAIKP 
iMSAIKF 
MSAIKP, 
MSAIKP, 



MSAIKP 



^/iyiKIBL™SGEVl^GHKPTITGEGQGKPYEGTQTLMLT.VTKGVPLPFAFDILSTAFQYGNP. 
WIKIBLVMEGEVl^GHKFTiTGEGQGK 

vjyiKIELViVIEGEVNGHKFTITGEGgGKPYEGTQTLMIiTVTKGVPLP.FAFDILSTAFQYGNR 

W1KIEL\^>1EGEWIGHK?TITGEGQGKPYEGTQTLNLOTTKGVFLPFA?DILSTAFQYGKR 

\^IKIELVI^EGEVKGHiCFTITGEGQGKPYEGTQTLgLfVTKGVPL^ 

yt4KIELYMEGEVHGHKFTITGEGQ3KPYEGTQTL2t.TVTKGVPLPFAFblL^ 

VC'lklELVMEGEVNGHKFTITGEGgGKPYEGTQTLSLr^/TkGVFLPFAFDiLST^^ 

VMKIELVMEGEV^IGHKFTITGEGQGKPYEGTQTLWLTVTKGVPLPFAFDILSTAFQYGKR 

VMKIELVKEGEVWGHKFTITGEGQGKPYEGTQTLNLTVTKGVPLPFAFD 

vflkielve'iegevl^tgkkftxtgegqgkpyegtqtlsltvtkgvplpfaf^ 
vMkielvmegevwgkkftitgegqgkpyegtqtlnltvtkgvplpfafdilstafqygnr 



POLYTHOA2

consensus VMKIELVMEGEVNGHKFTITGEGQGKPYEGTQTLnLTVTKGVPLPFAFDILSTAFQYGNR



pGR1
pGR10
pGR13
pGR16
pGR3
pGR4
pGR5
pGR6
pGR7
pGR8
POLYTHOA2



consensus WTKYPDDIPDYFKQTFPECrrSWERTFKXEECTCFTKSDiSI^GQPDCFQYKIWPKGEK 
pGR1
pGR10
pGR13
pGR16
pGR3
pGR4
pGR5
pGR6
pGR7
pGR8



ivi^.i.iuviJraJuaAyL;i^t3iJVi^>\i-VbijbEC3U<JrtiK^jjei-Jo ^ 

ILDPHGFIMQKKTLKWEPSTEKMYMDTOICDGAKVLKGDVNAALLLEGGGHYRCDFNSTyKii- 
LDPNGPIMQKKTLKWEPSTERMraDXnDKDGAKVLKGDVNAALLLEGGGHYRCDFNSTYKJi 
tDFMGPIMQKKTLKWEPSTERMYr-^DVDIKDGAKVLKGDWAALLLEGGGHYRCDFNSTYKA 
LDPNGPIMQKKTLKWEPSTERMYMDVDKDGAK^/LKGDmzUVLLLEGGGHYRCDFNSTYK.^^^ 
LDPWGFIHQKKTLKWEPSTERHY>IDVDia)GAKVLKGDVNAALLLEGGGHYRCDFNSTYKiA 
LDPKGFIMQKKTLKWEPSTERHYMD\aDKDGAKVLKGDVTJAALLLEGGGHYRCDFNSTYKAj 
LDPHGFIMQKKTLKWEPSTERMYMEVDKDGAK^LKGDViMMLLLEGGGHYRCDFNSTYKA 
LDPWGPIMSKKTLlCWEPSTER^r/MD\a)ICDGAIWLKGDVl^JAALLLEGGGHYRCDF 
LDPITGFIMQKKTLKWEPSTERMYMDTOl^DGAICVLICGDWAALLLEGGGHYRCDFNSTYKA 
LOPMGPIMQKKTLKWEPSTERMYMDVDKDGAKVLKGDVIvrAAIiLtjEGG GHYRCDFNSTYF 



POLYTHOA2

consensus IJ)PTOPIMqKKTLKWEPSTERMYMDVDKDGAK\^KGDVNAALIjLE



pGR1

pGR10

pGR13



KKTVSFPAYKFVDHRIEILSHNTDYSKVTLYEVAVARI^IS 
KKTVS FPJiYHF VDHRI E I LSKNTDY S I0;TLY E V AVARMS 
KKWSFPAYHFVDKRIEILSKNTDYSKVTLYEVAVAJINS 
KKTVS FP AYHFVDHRI E I LSKMTDYSKVTLYEVAVARNS 

kkt vs f p ayhfvdhri e i lshntd ys fcvtlyevavarn! 
kktvsffayhfvdhrieilshntdyskvtlyevavarnI 

KKTVSFPAYHFVDHRIEILSH'MTDYSKVTLYEVAVARMS 

Ikktvsffayhfvdhrieilskhtdysicvtlyevavarms 
kktvsffayhfvdhrieilshntdysio/tlyevavarhs 
kktvsfpayhfvdhrieilsfj^itdyskvtlyevavarms 
kktvsffayhfvdhrleilshntdysprvtlyevavarms 



PLQIMAPg 
PLQIMAPQ 
PLQIMAPQ 
PLQIMAPQ 



pGR16

pGR4 IMSSBSBBHBSB^^ > 
pGR5 IMBSHBBSB BM . 

pGR6 

pGR7 

pGR8 IM^^iLUrlalgkViiiaL^ttiirVAaliifeUii^liiiiWgl^^f^^ 

POLYTHOA2

COnsensus^MmSFPAYHFVDHRIBIIiSaDSniJYSKV^ 



?LQI> 
PLQII4APQI 
PLQIRAPol 
P^QIM?^ 

PLQiriAPO 



pGR1
pGR10
pGR13
pGR16
pGR3
pGR4
pGR5
pGR6
pGR7
pGR8 L6TKLDA
POLYTHOA2
consensus
hybridPolyth2-Disco1

pGR14

pGR15 MKDDIKKLTMGGSHHHHHHGMASMTGGQQMGRDLYDDDDKV^RIQCGGIRPyLGDLG^^

pGR17

pGR18

pGR19

pGR20 

pGR21 

consensus 

hybridPolyth2-Discol 

pGR14 

pGR15 SRARVSVITYPBSTVj

pGR17 S^Eg^i^lHVgHDiJQigV

pGR18

pGR19

pGR20 

pGR21 

consensus msaikpvmkielvmegevnghkftitGEGqaKPYBGTqtlxxLtVt 



MSAIKPyMKIELVH5GS\^GHKFTlTGEGQGK?YEGTQTLNL'g!T 

[viSAlKPVMKIEWlSGE\^ GHKFTIT GEGCGK?YEGTQtliNLTVT 

^^GEG^GKPYEGTjiJ^Lgvi" 
MSAIK?VMKIELVMEGE\1JGHKFTITGEGQGK?YEGTQTLNLTVT 
HSAlKPVr.IKIELVMHGE\^GHKFTITGEGQGKPYEGTQTLNLXyTj 
MSAIKPVWKIELVI^IEGEWGHKFTITGEGCGKPYEGTQTLNLTVT 
MSAXKP\^mKIEL\^13G3\^]GHKFTITGEGQGK?YEGTQTI:MLTVT 



hybridPolyth2-Disco1
pGR14
pGR15
pGR17
pGR18
pGR19
pGR20
pGR21
consensus

hybridPolyth2-Disco1
pGR14
pGR15
pGR17
pGR18
pGR19
pGR20
pGR21
consensus

hybridPolyth2-Disco1
pGR14
pGR15
pGR17
pGR18
pGR19
pGR20
pGR21
consensus



KGVPLPFAFDILTTAFQYGNRAfTKyPSDIPDYFKQTF?QGyTWERTH>JFEDGGVCT\^ 
KGVPL■PFAFDILTTAF"QYG^TRAFTKYFSbIPDyFKGT 

kgvplpfafdilttafqygnraftkypsdipdyfkqtfpqgytwerti^fedggvctvi^js 
^gSplSfaSdilttafqygnraftkypsdipdyfkqtfpqgytwertmnfedggvc'^^ 

KGVPLPFAFDILTTi=^FQYGNRAFTKYFSDIPDYFKQTFPQGYTWERTrWEDGGVCTTO 
I<GVPLPFAFDILTTAFQYG:^RAF?K'/FSDIPDyFKQTFPQGYTWERTM^IFEDGGVCTy^ 
KGVPLPFAFDILTTAFQYGNRAFTKYPSDIPDYFKQTFPQGYTWERTMFEDGGVCTVNS 
KGVPLPFAFDILTTAFQYGMRAFTKYPSDIPDYFKQTFPQGYTWERTMMFEDGGVCTVNS 



kGvPLpFAfDILTrAFQyGNRAFTKyPSDIPDYFKQTPPQGYTWERrTOtlFBI^^ 



DVSLKSGCFEYKIRFDGENFPPNGPVTyiQKKTVKl'JEPSTEi^IiMyEHDGMLKGDVSRTLLI^^ 

OVSLK3GCFEYKIRFDGEI^^FPFi^^GPVHQKKTVI<^^FE?STEl<5^f/EHDG^iLk^^ 

DVSLK3GCFEYKIRFDGEWFPFWGPVMQKKTVKWSPSTENMYEHDGMLKGDVSRTLLLEG 
GGHYQCDFKTIYKAKDSQGIKIViPEYHFVDHRIEILSKDKDYKWVK^/YEIAEAHYSK^
GGHY(3CDFKTIYKAKDSQGIKMPEYHFVDHRIEILSHDKDYraWKVYEIAEAHYSKLP5^^

GGHYQCDFKTIYKAKDSQGIKJ:^IPEYHE7D^mIEILSHDKDYK^n/KVYErASAH 

GGHYQCDFKTIYKAkDSQGIKMPEYHFVTJHRiEILSI^ 

GGHYQCDFKT.IYKAKDSQGIKr.lPEYHFVDHRIEILSHDKDYKr'lVKVYEIAEAHYSKL»P^^^^ 
GGHYaCDFKTiykAKDSQGiklvrPE^^E^HRiE^ 

3GHY0CDFKTIYKAKDSQGil<I.iPSYHFVDHRIEI LSHDI<DYrcr>r^KV^EIAEAHY^^^^ 



GGHyQCTFKriyKAKDSQGIKMPBYHFVDHRIBILSHDKDYKMVKVyBIAEAHySW!^ 
pGR19
pGR21

consensus
New Figure 8b:


Alignment of pGR22, pGR24, pGR25, pGR26 and Polythoa1. Each block of five rows
below lists the sequences in that order; the four blocks start at residues 1,
61, 121 and 181.

L^JSAIK?VMKVELVHEGNVKGHKFTITG£GQGKPYEG?HTLMLTVTKGG?LPFAYDILSA;i. 
MSAIKPVMKVEL^-I-lEGHVKGHKFTITGSGQGKFYEGTnTLNLTVTKGGPLPFAYCILSAA 
MSAIKPVMKVELVMEGNVNGHKFTITGEGQGKPYEGTHTLMLTVTKGGPLPFAYDILSAA 
MSAIKPVMKVELVMSGb3VN-GHKFTITGSGQC-KPYEGTHTL>^LTVTKGG?LP?AYDILSAA 
m.<^ATK^VMKV£LVMH:GN-VKGHKFTITGSGQGK?YEG?HTLNLTVTKGGPLPFAYDILSAA 



FQYGNRVFTKYPEDIPDYFKQTFPEGYS'/JERTFKYDEGLCTTKSDICLKKGEPDCFQYKI 
FQYGNRVFTKY?SDrPDYFKQTFPEGYS';?£RTFKYDEGXiCTTKSDICLKKGEPDCFQYKI 

fqygnrvftkypedipdyfkqtfpegyswertfkydsglcttksdiclkkgepdcfqyki 

FQYGNRVFTKYPEDIPDYFKQTFPEGYSVJERTFKYDEGLCTTKSDICLKKGEPDCFQYKI 
FQYGMRVfcTKYPEDIPDY FKQTFPEG YS'/JERTFKYDEGLCTTKSDICLKKGEPDCFQYKI 



FEGKMl^PSGPrMQKKTLKWE?STERWY^]DVDKDGAKyLI<GDDNAALrjLEGGC-HYRCDF 
vFEGV^3LGP5GP]:^3QKKTLKWE?STERMYMDVE•KDGAKyLKGDDNAALLLEGGG^:YRCDl• 
YFEGX^1LGPSGPIHQKKTLKVIEPSTERMYMDVDKDGAK^/LKGDDNAALLLEGGGHYRCDF 
•^FEGK>3LGPSGPIMQKKTLKWEPSTS:WMDVDKDGAKyLKGDDNAALlLEGGGHYRCDF 
^FEGK>1LGPSGPIMQKKTT..KWF.PSTERMYMDVDKDGAKVLKGCDNAALLLEGGGHYRCDF 



NSIYKAKKTGSl?AYHWIDHRIEILSHDKDYNKVTMHEFAAARNSPF?ir-L=^PC 
NSIYKAKKTGSLPAYHWIDHRIEILSHDKDYNKVTMHEFAAARNSPFPIHAPC 
KSIY:<AKKTGSL?AYHWIDHRIEILSHDKDYNia-TMHEFAAARKSPF?Il''LAPC 
>^SIYKAKKTGSL?AYHWIDHRIEILSHDKDYNKVTMHE?AAARKSPF?IMAPC.' 
^ISIYICAI<KTGSL?AYHWIDHRXEILSHDKDY^)KVT^mH^ 
Nucleotide sequence of pGR3

1 aattctctac cgtcatgagt gcaattaaac cagtcatgaa gattgaattg gtcatggaag 
61 gagaggtgaa cgggcacaag ttcacgatca cgggagaggg acaaggcaag ccttacgagg 
121 gaacacagac tctagacctt acagtcacta aaggcgtgcc ccttcctttc gctttcgata 
181 tcttgtcaac agcattccag tatggcaaca gggtatttac caaataccca gatgatatac 
241 cggactattt caagcagacc tttccggaag gatattcgtg ggaaagaact ttcaaatatg 
301 aagagggcgt ttgcaccaca aagagtgaca taagcctcaa gaaaggccaa ccagactgct 
361 ttcaatataa aattaacttt aaaggggaga agcttgaccc caacggccca attatgcaga
421 agaagaccct gaaatgggag ccatccactg agaggatgta catggacgtg gataaagacg 
481 gtgcaaaggt gctgaagggc gatgttaatg cggccctgtt gcttgaagga ggtggccatt 
541 atcgttgtga ctttaacagt acttacaagg cgaagaaaac tgtgtccttc ccagcatatc 
601 actttgtgga ccaccgcatt gagattttga gccacaatac ggattacagc aaggttacgc 
661 tgtatgaagt tgccgtggct cgcaatctcg agcaccacca ccaccaccac tgagatccgg 
721 ctgctaacaa agcccgaaag gaagctgagt tggctgctgc caccgctgag caataactag 
781 cataacccct tggggcctct aaacgggtct tgaggggttt tttgctgaaa ggaggaacta 
841 tatccggatt ggcgaatggg acgcgccctg tagcggcgca ttaagcgcgg cgggtgtggt 
901 ggttacgcgc agcgtgaccg ctacacttgc cagcgcccta gcgcccgctc ctttcgcttt 
961 cttcccttcc tttctcgcca cgttcgccgg ctttccccgt caagctctaa atcgggggct 
1021 ccctttaggg ttccgattta gtgctttacg gcacctcgac cccaaaaaac ttgattaggg
1081 tgatggttca cgtagtgggc catcgccctg atagacggtt tttcgccctt tgacgttgga 
1141 gtccacgttc tttaatagtg gactcttgtt ccaaactgga acaacactca accctatctc 
1201 ggtctattct tttgatttat aagggatttt gccgatttcg gcctattggt taaaaaatga 
1261 gctgatttaa caaaaattta acgcgaattt taacaaaata ttaacgttta caatttcagg 
1321 tggcactttt cggggaaatg tgcgcggaac ccctatttgt ttatttttct aaatacattc 
1381 aaatatgtat ccgctcatga gacaataacc ctgataaatg cttcaataat attgaaaaag 
1441 gaagagtatg agtattcaac atttccgtgt cgcccttatt cccttttttg cggcattttg 
1501 ccttcctgtt tttgctcacc cagaaacgct ggtgaaagta aaagatgctg aagatcagtt 
1561 gggtgcacga gtgggttaca tcgaactgga tctcaacagc ggtaagatcc ttgagagttt 
1621 tcgccccgaa gaacgttttc caatgatgag cacttttaaa gttctgctat gtggcgcggt 
1681 attatcccgt attgacgccg ggcaagagca actcggtcgc cgcatacact attctcagaa 
1741 tgacttggtt gagtactcac cagtcacaga aaagcatctt acggatggca tgacagtaag
1801 agaattatgc agtgctgcca taaccatgag tgataacact gcggccaact tacttctgac 
1861 aacgatcgga ggaccgaagg agctaaccgc ttttttgcac aacatggggg atcatgtaac 
1921 tcgccttgat cgttgggaac cggagctgaa tgaagccata ccaaacgacg agcgtgacac 
1981 cacgatgcct gcagcaatgg caacaacgtt gcgcaaacta ttaactggcg aactacttac 
2041 tctagcttcc cggcaacaat taatagactg gatggaggcg gataaagttg caggaccact 
2101 tctgcgctcg gcccttccgg ctggctggtt tattgctgat aaatctggag ccggtgagcg 
2161 tgggtctcgc ggtatcattg cagcactggg gccagatggt aagccctccc gtatcgtagt 
2221 tatctacacg acggggagtc aggcaactat ggatgaacga aatagacaga tcgctgagat 
2281 aggtgcctca ctgattaagc attggtaact gtcagaccaa gtttactcat atatacttta 
2341 gattgattta aaacttcatt tttaatttaa aaggatctag gtgaagatcc tttttgataa
2401 tctcatgacc aaaatccctt aacgtgagtt ttcgttccac tgagcgtcag accccgtaga
2461 aaagatcaaa ggatcttctt gagatccttt ttttctgcgc gtaatctgct gcttgcaaac 
2521 aaaaaaacca ccgctaccag cggtggtttg tttgccggat caagagctac caactctttt 
2581 tccgaaggta actggcttca gcagagcgca gataccaaat actgtccttc tagtgtagcc 
2641 gtagttaggc caccacttca agaactctgt agcaccgcct acatacctcg ctctgctaat 
2701 cctgttacca gtggctgctg ccagtggcga taagtcgtgt cttaccgggt tggactcaag 
2761 acgatagtta ccggataagg cgcagcggtc gggctgaacg gggggttcgt gcacacagcc 
2821 cagcttggag cgaacgacct acaccgaact gagataccta cagcgtgagc tatgagaaag 
2881 cgccacgctt cccgaaggga gaaaggcgga caggtatccg gtaagcggca gggtcggaac 
2941 aggagagcgc acgagggagc ttccaggggg aaacgcctgg tatctttata gtcctgtcgg 
3001 gtttcgccac ctctgacttg agcgtcgatt tttgtgatgc tcgtcagggg ggcggagcct 
3061 atggaaaaac gccagcaacg cggccttttt acggttcctg gccttttgct ggccttttgc 
3121 tcacatgttc tttcctgcgt tatcccctga ttctgtggat aaccgtatta ccgcctttga 
3181 gtgagctgat accgctcgcc gcagccgaac gaccgagcgc agcgagtcag tgagcgagga 
3241 agcggaagag cgcctgatgc ggtattttct ccttacgcat ctgtgcggta tttcacaccg 
3301 catatatggt gcactctcag tacaatctgc tctgatgccg catagttaag ccagtataca 
3361 ctccgctatc gctacgtgac tgggtcatgg ctgcgccccg acacccgcca acacccgctg 
3421 acgcgccctg acgggcttgt ctgctcccgg catccgctta cagacaagct gtgaccgtct
3481 ccgggagctg catgtgtcag aggttttcac cgtcatcacc gaaacgcgcg aggcagctgc
3541 ggtaaagctc atcagcgtgg tcgtgaagcg attcacagat gtctgcctgt tcatccgcgt 
3601 ccagctcgtt gagtttctcc agaagcgtta atgtctggct tctgataaag cgggccatgt
3661 taagggcggt tttttcctgt ttggtcactg atgcctccgt gtaaggggga tttctgttca 
3721 tgggggtaat gataccgatg aaacgagaga ggatgctcac gatacgggtt actgatgatg
3781 aacatgcccg gttactggaa cgttgtgagg gtaaacaact ggcggtatgg atgcggcggg 
3841 accagagaaa aatcactcag ggtcaatgcc agcgcttcgt taatacagat gtaggtgttc
3901 cacagggtag ccagcagcat cctgcgatgc agatccggaa cataatggtg cagggcgctg 
3961 acttccgcgt ttccagactt tacgaaacac ggaaaccgaa gaccattcat gttgttgctc 
4021 aggtcgcaga cgttttgcag cagcagtcgc ttcacgttcg ctcgcgtatc ggtgattcat 
4081 tctgctaacc agtaaggcaa ccccgccagc ctagccgggt cctcaacgac aggagcacga 
4141 tcatgcgcac ccgtggggcc gccatgccgg cgataatggc ctgcttctcg ccgaaacgtt 
4201 tggtggcggg accagtgacg aaggcttgag cgagggcgtg caagattccg aataccgcaa 
4261 gcgacaggcc gatcatcgtc gcgctccagc gaaagcggtc ctcgccgaaa atgacccaga 
4321 gcgctgccgg cacctgtcct acgagttgca tgataaagaa gacagtcata agtgcggcga 
4381 cgatagtcat gccccgcgcc caccggaagg agctgactgg gttgaaggct ctcaagggca 
4441 tcggtcgaga tcccggtgcc taatgagtga gctaacttac attaattgcg ttgcgctcac 
4501 tgcccgcttt ccagtcggga aacctgtcgt gccagctgca ttaatgaatc ggccaacgcg 
4561 cggggagagg cggtttgcgt attgggcgcc agggtggttt ttcttttcac cagtgagacg 
4621 ggcaacagct gattgccctt caccgcctgg ccctgagaga gttgcagcaa gcggtccacg 
4681 ctggtttgcc ccagcaggcg aaaatcctgt ttgatggtgg ttaacggcgg gatataacat
4741 gagctgtctt cggtatcgtc gtatcccact accgagatgt ccgcaccaac gcgcagcccg
4801 gactcggtaa tggcgcgcat tgcgcccagc gccatctgat cgttggcaac cagcatcgca
4861 gtgggaacga tgccctcatt cagcatttgc atggtttgtt gaaaaccgga catggcactc
4921 cagtcgcctt cccgttccgc tatcggctga atttgattgc gagtgagata tttatgccag
4981 ccagccagac gcagacgcgc cgagacagaa cttaatgggc ccgctaacag cgcgatttgc 
5041 tggtgaccca atgcgaccag atgctccacg cccagtcgcg taccgtcttc atgggagaaa 
5101 ataatactgt tgatgggtgt ctggtcagag acatcaagaa ataacgccgg aacattagtg 
5161 caggcagctt ccacagcaat ggcatcctgg tcatccagcg gatagttaat gatcagccca 
5221 ctgacgcgtt gcgcgagaag attgtgcacc gccgctttac aggcttcgac gccgcttcgt 
5281 tctaccatcg acaccaccac gctggcaccc agttgatcgg cgcgagattt aatcgccgcg 
5341 acaatttgcg acggcgcgtg cagggccaga ctggaggtgg caacgccaat cagcaacgac 
5401 tgtttgcccg ccagttgttg tgccacgcgg ttgggaatgt aattcagctc cgccatcgcc 
5461 gcttccactt tttcccgcgt tttcgcagaa acgtggctgg cctggttcac cacgcgggaa
5521 acggtctgat aagagacacc ggcatactct gcgacatcgt ataacgttac tggtttcaca 
5581 ttcaccaccc tgaattgact ctcttccggg cgctatcatg ccataccgcg aaaggttttg 
5641 cgccattcga tggtgtccgg gatctcgacg ctctccctta tgcgactcct gcattaggaa 
5701 gcagcccagt agtaggttga ggccgttgag caccgccgcc gcaaggaatg gtgcatgcaa 
5761 ggagatggcg cccaacagtc ccccggccac ggggcctgcc accataccca cgccgaaaca 
5821 agcgctcatg agcccgaagt ggcgagcccg atcttcccca tcggtgatgt cggcgatata 
5881 ggcgccagca accgcacctg tggcgccggt gatgccggcc acgatgcgtc cggcgtagag 
5941 gatcgagatc gatctcgatc ccgcgaaatt aatacgactc actatagggg aattgtgagc 
6001 ggataacaat tcccctctag aaataatttt gtttaacttt aagaaggaga tatacatatg 
6061 agcgataaaa ttattcacct gactgacgac agttttgaca cggatgtact caaagcggac 
6121 ggggcgatcc tcgtcgattt ctgggcagag tggtgcggtc cgtgcaaaat gatcgccccg 
6181 attctggatg aaatcgctga cgaatatcag ggcaaactga ccgttgcaaa actgaacatc 
6241 gatcaaaacc ctggcactgc gccgaaatat ggcatccgtg gtatcccgac tctgctgctg 
6301 ttcaaaaacg gtgaagtggc ggcaaccaaa gtgggtgcac tgtctaaagg tcagttgaaa 
6361 gagttcctcg acgctaacct ggccggttct ggttctggcc atatgcacca tcatcatcat 
6421 cattcttctg gtctggtgcc acgcggttct ggtatgaaag aaaccgctgc tgctaaattc
6481 gaacgccagc acatggacag cccagatctg ggtaccgacg acgacgacaa ggccatggct
6541 gatatcggat ccg 

Nucleotide sequence of pGR4

1 aattctctac cgtcatgagt gcaattaaac cagtcatgaa gattgaattg gtcatggaag 

61 gagaggtgaa cgggcacaag ttcacgatca cgggagaggg acaaggcaag ccttacgagg 
121 gaacacagac tctagacctt acagtcacta aaggcgtgcc ccttcctttc gctttcgata 
181 tcttgtcaac agcattccag tatggcaaca gggtatttac caaataccca gatgatatac 
241 cggactattt caagcagacc tttccggaag gatattcgtg ggaaagaact ttcaaatatg 
301 aagagggcgt ttgcaccaca aagagtgaca taagcctcaa gaaaggccaa ccagactgct 
361 ttcaatataa aattaacttt aaaggggaga agcttgaccc caacggccca attatgcaga
421 agaagaccct gaaatgggag ccatccactg agaggatgta catggacgtg gataaagacg
481 gtgcaaaggt gctgaagggc gatgttaatg cggccctgtt gcttgaagga ggtggccatt
541 atcgttgtga ctttaacagt acttacaagg cgaagaaaac tgtgtccttc ccagcatatc
601 actttgtgga ccaccgcatt gagattttga gccacaatac ggattacagc aaggttacgc
661 tgtatgaagt tgccgtggct cgcaatctcg agcatgcatc tagagggccc tattctatag
721 tgtcacctaa atgctagagc tcgctgatca gcctcgactg tgccttctag ttgccagcca
781 tctgttgttt gcccctcccc cgtgccttcc ttgaccctgg aaggtgccac tcccactgtc
841 ctttcctaat aaaatgagga aattgcatcg cattgtctga gtaggtgtca ttctattctg
901 gggggtgggg tggggcagga cagcaagggg gaggattggg aagacaatag caggcatgct
961 ggggatgcgg tgggctctat ggcttctgag gcggaaagaa ccagctgggg ctctaggggg
1021 tatccccacg cgccctgtag cggcgcatta agcgcggcgg gtgtggtggt tacgcgcagc
1081 gtgaccgcta cacttgccag cgccctagcg cccgctcctt tcgctttctt cccttccttt
1141 ctcgccacgt tcgccggctt tccccgtcaa gctctaaatc ggggcatccc tttagggttc
1201 cgatttagtg ctttacggca cctcgacccc aaaaaacttg attagggtga tggttcacgt
1261 agtgggccat cgccctgata gacggttttt cgccctttga cgttggagtc cacgttcttt
1321 aatagtggac tcttgttcca aactggaaca acactcaacc ctatctcggt ctattctttt
1381 gatttataag ggattttggg gatttcggcc tattggttaa aaaatgagct gatttaacaa
1441 aaatttaacg cgaattaatt ctgtggaatg tgtgtcagtt agggtgtgga aagtccccag
1501 gctccccagg caggcagaag tatgcaaagc atgcatctca attagtcagc aaccaggtgt
1561 ggaaagtccc caggctcccc agcaggcaga agtatgcaaa gcatgcatct caattagtca
1621 gcaaccatag tcccgcccct aactccgccc atcccgcccc taactccgcc cagttccgcc
1681 cattctccgc cccatggctg actaattttt tttatttatg cagaggccga ggccgcctct
1741 gcctctgagc tattccagaa gtagtgagga ggcttttttg gaggcctagg cttttgcaaa
1801 aagctcccgg gagcttgtat atccattttc ggatctgatc aagagacagg atgaggatcg
1861 tttcgcatga ttgaacaaga tggattgcac gcaggttctc cggccgcttg ggtggagagg
1921 ctattcggct atgactgggc acaacagaca atcggctgct ctgatgccgc cgtgttccgg
1981 ctgtcagcgc aggggcgccc ggttcttttt gtcaagaccg acctgtccgg tgccctgaat
2041 gaactgcagg acgaggcagc gcggctatcg tggctggcca cgacgggcgt tccttgcgca
2101 gctgtgctcg acgttgtcac tgaagcggga agggactggc tgctattggg cgaagtgccg
2161 gggcaggatc tcctgtcatc tcaccttgct cctgccgaga aagtatccat catggctgat
2221 gcaatgcggc ggctgcatac gcttgatccg gctacctgcc cattcgacca ccaagcgaaa
2281 catcgcatcg agcgagcacg tactcggatg gaagccggtc ttgtcgatca ggatgatctg
2341 gacgaagagc atcaggggct cgcgccagcc gaactgttcg ccaggctcaa ggcgcgcatg
2401 cccgacggcg aggatctcgt cgtgacccat ggcgatgcct gcttgccgaa tatcatggtg
2461 gaaaatggcc gcttttctgg attcatcgac tgtggccggc tgggtgtggc ggaccgctat
2521 caggacatag cgttggctac ccgtgatatt gctgaagagc ttggcggcga atgggctgac
2581 cgcttcctcg tgctttacgg tatcgccgct cccgattcgc agcgcatcgc cttctatcgc
2641 cttcttgacg agttcttctg agcgggactc tggggttcga aatgaccgac caagcgacgc
2701 ccaacctgcc atcacgagat ttcgattcca ccgccgcctt ctatgaaagg ttgggcttcg
2761 gaatcgtttt ccgggacgcc ggctggatga tcctccagcg cggggatctc atgctggagt
2821 tcttcgccca ccccaacttg tttattgcag cttataatgg ttacaaataa agcaatagca
2881 tcacaaattt cacaaataaa gcattttttt cactgcattc tagttgtggt ttgtccaaac
2941 tcatcaatgt atcttatcat gtctgtatac cgtcgacctc tagctagagc ttggcgtaat
3001 catggtcata gctgtttcct gtgtgaaatt gttatccgct cacaattcca cacaacatac
3061 gagccggaag cataaagtgt aaagcctggg gtgcctaatg agtgagctaa ctcacattaa
3121 ttgcgttgcg ctcactgccc gctttccagt cgggaaacct gtcgtgccag ctgcattaat
3181 gaatcggcca acgcgcgggg agaggcggtt tgcgtattgg gcgctcttcc gcttcctcgc
3241 tcactgactc gctgcgctcg gtcgttcggc tgcggcgagc ggtatcagct cactcaaagg
3301 cggtaatacg gttatccaca gaatcagggg ataacgcagg aaagaacatg tgagcaaaag
3361 gccagcaaaa ggccaggaac cgtaaaaagg ccgcgttgct ggcgtttttc cataggctcc
3421 gcccccctga cgagcatcac aaaaatcgac gctcaagtca gaggtggcga aacccgacag
3481 gactataaag ataccaggcg tttccccctg gaagctccct cgtgcgctct cctgttccga
3541 ccctgccgct taccggatac ctgtccgcct ttctcccttc gggaagcgtg gcgctttctc
3601 aatgctcacg ctgtaggtat ctcagttcgg tgtaggtcgt tcgctccaag ctgggctgtg
3661 tgcacgaacc ccccgttcag cccgaccgct gcgccttatc cggtaactat cgtcttgagt
3721 ccaacccggt aagacacgac ttatcgccac tggcagcagc cactggtaac aggattagca
3781 gagcgaggta tgtaggcggt gctacagagt tcttgaagtg gtggcctaac tacggctaca
3841 ctagaaggac agtatttggt atctgcgctc tgctgaagcc agttaccttc ggaaaaagag
3901 ttggtagctc ttgatccggc aaacaaacca ccgctggtag cggtggtttt tttgtttgca
3961 agcagcagat tacgcgcaga aaaaaaggat ctcaagaaga tcctttgatc ttttctacgg
4021 ggtctgacgc tcagtggaac gaaaactcac gttaagggat tttggtcatg agattatcaa
4081 aaaggatctt cacctagatc cttttaaatt aaaaatgaag ttttaaatca atctaaagta
4141 tatatgagta aacttggtct gacagttacc aatgcttaat cagtgaggca cctatctcag
4201 cgatctgtct atttcgttca tccatagttg cctgactccc cgtcgtgtag ataactacga
4261 tacgggaggg cttaccatct ggccccagtg ctgcaatgat accgcgagac ccacgctcac
4321 cggctccaga tttatcagca ataaaccagc cagccggaag ggccgagcgc agaagtggtc
4381 ctgcaacttt atccgcctcc atccagtcta ttaattgttg ccgggaagct agagtaagta
4441 gttcgccagt taatagtttg cgcaacgttg ttgccattgc tacaggcatc gtggtgtcac
4501 gctcgtcgtt tggtatggct tcattcagct ccggttccca acgatcaagg cgagttacat
4561 gatcccccat gttgtgcaaa aaagcggtta gctccttcgg tcctccgatc gttgtcagaa
4621 gtaagttggc cgcagtgtta tcactcatgg ttatggcagc actgcataat tctcttactg
4681 tcatgccatc cgtaagatgc ttttctgtga ctggtgagta ctcaaccaag tcattctgag
4741 aatagtgtat gcggcgaccg agttgctctt gcccggcgtc aatacgggat aataccgcgc
4801 cacatagcag aactttaaaa gtgctcatca ttggaaaacg ttcttcgggg cgaaaactct
4861 caaggatctt accgctgttg agatccagtt cgatgtaacc cactcgtgca cccaactgat
4921 cttcagcatc ttttactttc accagcgttt ctgggtgagc aaaaacagga aggcaaaatg
4981 ccgcaaaaaa gggaataagg gcgacacgga aatgttgaat actcatactc ttcctttttc
5041 aatattattg aagcatttat cagggttatt gtctcatgag cggatacata tttgaatgta
5101 tttagaaaaa taaacaaata ggggttccgc gcacatttcc ccgaaaagtg ccacctgacg
5161 tcgacggatc gggagatctc ccgatcccct atggtcgact ctcagtacaa tctgctctga
5221 tgccgcatag ttaagccagt atctgctccc tgcttgtgtg ttggaggtcg ctgagtagtg
5281 cgcgagcaaa atttaagcta caacaaggca aggcttgacc gacaattgca tgaagaatct
5341 gcttagggtt aggcgttttg cgctgcttcg cgatgtacgg gccagatata cgcgttgaca
5401 ttgattattg actagttatt aatagtaatc aattacgggg tcattagttc atagcccata
5461 tatggagttc cgcgttacat aacttacggt aaatggcccg cctggctgac cgcccaacga
5521 cccccgccca ttgacgtcaa taatgacgta tgttcccata gtaacgccaa tagggacttt
5581 ccattgacgt caatgggtgg actatttacg gtaaactgcc cacttggcag tacatcaagt
5641 gtatcatatg ccaagtacgc cccctattga cgtcaatgac ggtaaatggc ccgcctggca
5701 ttatgcccag tacatgacct tatgggactt tcctacttgg cagtacatct acgtattagt
5761 catcgctatt accatggtga tgcggttttg gcagtacatc aatgggcgtg gatagcggtt
5821 tgactcacgg ggatttccaa gtctccaccc cattgacgtc aatgggagtt tgttttggca
5881 ccaaaatcaa cgggactttc caaaatgtcg taacaactcc gccccattga cgcaaatggg
5941 cggtaggcgt gtacggtggg aggtctatat aagcagagct ctctggctaa ctagagaacc
6001 cactgcttac tggcttatcg aaattaatac gactcactat agggagaccc aagcttggta
6061 ccgagctcgg atccactagt aacggccgcc agtgtgctgg



Nucleotide sequence of pGR5

1 aattcgccct tctggaattc tttaccgtca tgagtgcaat taaaccagtc atgaagattg
61 aattggtcat ggaaggagag gtgaacgggc acaagttcac gatcacggga gagggacaag
121 gcaagcctta cgagggaaca cagactctag accttacagt cactaaaggc gtgccccttc
181 ctttcgcttt cgatatcttg tcaacagcat tccagtatgg caacagggta tttaccaaat
241 acccagatga tataccggac tatttcaagc agacctttcc ggaaggatat tcgtgggaaa
301 gaactttcaa atatgaagag ggcgtttgca ccacaaagag tgacataagc ctcaagaaag
361 gccaaccaga ctgctttcaa tataaaatta actttaaagg ggagaagctt gaccccaacg
421 gcccaattat gcagaagaag accctgaaat gggagccatc cactgagagg atgtacatgg
481 acgtggataa agacggtgca aaggtgctga agggcgatgt taatgcggcc ctgttgcttg
541 aaggaggtgg ccattatcgt tgtgacttta acagtactta caaggcgaag aaaactgtgt
601 ccttcccagc atatcacttt gtggaccacc gcattgagat tttgagccac aatacggatt
661 acagcaaggt tacactgtat gaagttgccg tggctcgcaa ttctcctctt cagattatiag
721 cctcgagcat gcatctagag ggccctattc tatagtgtca cctaaatgct agagctcgct
781 gatcagcctc gactgtgcct tctagttgcc agccatctgt tgtttgcccc tcccccgtgc
841 cttccttgac cctggaaggt gccactccca ctgtcctttc ctaataaaat gaggaaattg
901 catcgcattg tctgagtagg tgtcattcta ttctgggggg tggggtgggg caggacagca
961 agggggagga ttgggaagac aatagcaggc atgctgggga tgcggtgggc tctatggctt
1021 ctgaggcgga aagaaccagc tggggctcta gggggtatcc ccacgcgccc tgtagcggcg
1081 cattaagcgc ggcgggtgtg gtggttacgc gcagcgtgac cgctacactt gccagcgccc
1141 tagcgcccgc tcctttcgct ttcttccctt cctttctcgc cacgttcgcc ggctttcccc
1201 gtcaagctct aaatcggggc atccctttag ggttccgatt tagtgcttta cggcacctcg
1261 accccaaaaa acttgattag ggtgatggtt cacgtagtgg gccatcgccc tgatagacgg
1321 tttttcgccc tttgacgttg gagtccacgt tctttaatag tggactcttg ttccaaactg
1381 gaacaacact caaccctatc tcggtctatt cttttgattt ataagggatt ttggggattt
1441 cggcctattg gttaaaaaat gagctgattt aacaaaaatt taacgcgaat taattctgtg
1501 gaatgtgtgt cagttagggt gtggaaagtc cccaggctcc ccaggcaggc agaagtatgc
1561 aaagcatgca tctcaattag tcagcaacca ggtgtggaaa gtccccaggc tccccagcag
1621 gcagaagtat gcaaagcatg catctcaatt agtcagcaac catagtcccg cccctaactc 
1681 cgcccatccc gcccctaact ccgcccagtt ccgcccattc tccgccccat ggctgactaa 
1741 ttttttttat ttatgcagag gccgaggccg cctctgcctc tgagctattc cagaagtagt 
1801 gaggaggctt ttttggaggc ctaggctttt gcaaaaagct cccgggagct tgtatatcca 
1861 ttttcggatc tgatcaagag acaggatgag gatcgtttcg catgattgaa caagatggat 
1921 tgcacgcagg ttctccggcc gcttgggtgg agaggctatt cggctatgac tgggcacaac 
1981 agacaatcgg ctgctctgat gccgccgtgt tccggctgtc agcgcagggg cgcccggttc
2041 tttttgtcaa gaccgacctg tccggtgccc tgaatgaact gcaggacgag gcagcgcggc 
2101 tatcgtggct ggccacgacg ggcgttcctt gcgcagctgt gctcgacgtt gtcactgaag 
2161 cgggaaggga ctggctgcta ttgggcgaag tgccggggca ggatctcctg tcatctcacc
2221 ttgctcctgc cgagaaagta tccatcatgg ctgatgcaat gcggcggctg catacgcttg 
2281 atccggctac ctgcccattc gaccaccaag cgaaacatcg catcgagcga gcacgtactc 
2341 ggatggaagc cggtcttgtc gatcaggatg atctggacga agagcatcag gggctcgcgc 
2401 cagccgaact gttcgccagg ctcaaggcgc gcatgcccga cggcgaggat ctcgtcgtga 
2461 cccatggcga tgcctgcttg ccgaatatca tggtggaaaa tggccgcttt tctggattca 
2521 tcgactgtgg ccggctgggt gtggcggacc gctatcagga catagcgttg gctacccgtg 
2581 atattgctga agagcttggc ggcgaatggg ctgaccgctt cctcgtgctt tacggtatcg 
2641 ccgctcccga ttcgcagcgc atcgccttct atcgccttct tgacgagttc ttctgagcgg 
2701 gactctgggg ttcgaaatga ccgaccaagc gacgcccaac ctgccatcac gagatttcga 
2761 ttccaccgcc gccttctatg aaaggttggg cttcggaatc gttttccggg acgccggctg 
2821 gatgatcctc cagcgcgggg atctcatgct ggagttcttc gcccacccca acttgtttat
2881 tgcagcttat aatggttaca aataaagcaa tagcatcaca aatttcacaa ataaagcatt
2941 tttttcactg cattctagtt gtggtttgtc caaactcatc aatgtatctt atcatgtctg 
3001 tataccgtcg acctctagct agagcttggc gtaatcatgg tcatagctgt ttcctgtgtg 
3061 aaattgttat ccgctcacaa ttccacacaa catacgagcc ggaagcataa agtgtaaagc 
3121 ctggggtgcc taatgagtga gctaactcac attaattgcg ttgcgctcac tgcccgcttt 
3181 ccagtcggga aacctgtcgt gccagctgca ttaatgaatc ggccaacgcg cggggagagg 
3241 cggtttgcgt attgggcgct cttccgcttc ctcgctcact gactcgctgc gctcggtcgt 
3301 tcggctgcgg cgagcggtat cagctcactc aaaggcggta atacggttat ccacagaatc
3361 aggggataac gcaggaaaga acatgtgagc aaaaggccag caaaaggcca ggaaccgtaa
3421 aaaggccgcg ttgctggcgt ttttccatag gctccgcccc cctgacgagc atcacaaaaa 
3481 tcgacgctca agtcagaggt ggcgaaaccc gacaggacta taaagatacc aggcgtttcc 
3541 ccctggaagc tccctcgtgc gctctcctgt tccgaccctg ccgcttaccg gatacctgtc 
3601 cgcctttctc ccttcgggaa gcgtggcgct ttctcaatgc tcacgctgta ggtatctcag 
3661 ttcggtgtag gtcgttcgct ccaagctggg ctgtgtgcac gaaccccccg ttcagcccga 
3721 ccgctgcgcc ttatccggta actatcgtct tgagtccaac ccggtaagac acgacttatc 
3781 gccactggca gcagccactg gtaacaggat tagcagagcg aggtatgtag gcggtgctac 
3841 agagttcttg aagtggtggc ctaactacgg ctacactaga aggacagtat ttggtatctg 
3901 cgctctgctg aagccagtta ccttcggaaa aagagttggt agctcttgat ccggcaaaca 
3961 aaccaccgct ggtagcggtg gtttttttgt ttgcaagcag cagattacgc gcagaaaaaa 
4021 aggatctcaa gaagatcctt tgatcttttc tacggggtct gacgctcagt ggaacgaaaa 
4081 ctcacgttaa gggattttgg tcatgagatt atcaaaaagg atcttcacct agatcctttt 
4141 aaattaaaaa tgaagtttta aatcaatcta aagtatatat gagtaaactt ggtctgacag 
4201 ttaccaatgc ttaatcagtg aggcacctat ctcagcgatc tgtctatttc gttcatccat 
4261 agttgcctga ctccccgtcg tgtagataac tacgatacgg gagggcttac catctggccc 
4321 cagtgctgca atgataccgc gagacccacg ctcaccggct ccagatttat cagcaataaa 
4381 ccagccagcc ggaagggccg agcgcagaag tggtcctgca actttatccg cctccatcca 
4441 gtctattaat tgttgccggg aagctagagt aagtagttcg ccagttaata gtttgcgcaa 
4501 cgttgttgcc attgctacag gcatcgtggt gtcacgctcg tcgtttggta tggcttcatt 
4561 cagctccggt tcccaacgat caaggcgagt tacatgatcc cccatgttgt gcaaaaaagc
4621 ggttagctcc ttcggtcctc cgatcgttgt cagaagtaag ttggccgcag tgttatcact 
4681 catggttatg gcagcactgc ataattctct tactgtcatg ccatccgtaa gatgcttttc 
4741 tgtgactggt gagtactcaa ccaagtcatt ctgagaatag tgtatgcggc gaccgagttg 
4801 ctcttgcccg gcgtcaatac gggataatac cgcgccacat agcagaactt taaaagtgct 
4861 catcattgga aaacgttctt cggggcgaaa actctcaagg atcttaccgc tgttgagatc 
4921 cagttcgatg taacccactc gtgcacccaa ctgatcttca gcatctttta ctttcaccag 
4981 cgtttctggg tgagcaaaaa caggaaggca aaatgccgca aaaaagggaa taagggcgac
5041 acggaaatgt tgaatactca tactcttcct ttttcaatat tattgaagca tttatcaggg 
5101 ttattgtctc atgagcggat acatatttga atgtatttag aaaaataaac aaataggggt 
5161 tccgcgcaca tttccccgaa aagtgccacc tgacgtcgac ggatcgggag atctcccgat 
5221 cccctatggt cgactctcag tacaatctgc tctgatgccg catagttaag ccagtatctg 
5281 ctccctgctt gtgtgttgga ggtcgctgag tagtgcgcga gcaaaattta agctacaaca

5341 aggcaaggct tgaccgacaa ttgcatgaag aatctgctta gggttaggcg ttttgcgctg 

5401 cttcgcgatg tacgggccag atatacgcgt tgacattgat tattgactag ttattaatag 

5461 taatcaatta cggggtcatt agttcatagc ccatatatgg agttccgcgt tacataactt 

5521 acggtaaatg gcccgcctgg ctgaccgccc aacgaccccc gcccattgac gtcaataatg 

5581 acgtatgttc ccatagtaac gccaataggg actttccatt gacgtcaatg ggtggactat 

5641 ttacggtaaa ctgcccactt ggcagtacat caagtgtatc atatgccaag tacgccccct 

5701 attgacgtca atgacggtaa atggcccgcc tggcattatg cccagtacat gaccttatgg

5761 gactttccta cttggcagta catctacgta ttagtcatcg ctattaccat ggtgatgcgg

5821 ttttggcagt acatcaatgg gcgtggatag cggtttgact cacggggatt tccaagtctc

5881 caccccattg acgtcaatgg gagtttgttt tggcaccaaa atcaacggga ctttccaaaa

5941 tgtcgtaaca actccgcccc attgacgcaa atgggcggta ggcgtgtacg gtgggaggtc 

6001 tatataagca gagctctctg gctaactaga gaacccactg cttactggct tatcgaaatt

6061 aatacgactc actataggga gacccaagct tggtaccgag ctcggatcca ctagtaacgg 

6121 ccgccagtgt gctgg 



Nucleotide sequence of pGR6 

1 tcgagatgca tggccggccg agctccgcat cggccgctgt catcagatcg ccatctcgcg 

61 cccgtgcctc tgacttctaa gtccaattac tcttcaacat ccctacatgc tctttctccc 
121 tgtgctccca ccccctattt ttgttattat caaaaaaact tcttcttaat ttctttgttt 
181 tttagcttct tttaagtcac ctctaacaat gaaattgtgt agattcaaaa atagaattaa
241 ttcgtaataa aaagtcgaaa aaaattgtgc tccctccccc cattaataat aattctatcc 
301 caaaatctac acaatgttct gtgtacacct cttatgtttt ttttacttct gataaatttt 
361 ttttgaaaca tcatagaaaa aaccgcacac aaaatacctt atcatatgtt acgtttcagt 
421 ttatgaccgc aatttttatt tcttcgcacg tctgggcctc tcatgacgtc aaatcatgct 
481 catcgtgaaa aagttttgga gtatttttgg aatttttcaa tcaagtgaaa gtttatgaaa
541 ttaattttcc tgcttttgct ttttgggggt ttcccctatt gtttgtcaag agtttcgagg 
601 acggcgtttt tcttgctaaa atcacaagta ttgatgagca cgatgcaaga aagatcggaa 
661 gaaggtttgg gtttgaggct cagtggaagg tgagtagaag ttgataattt gaaagtggag 
721 tagtgtctat ggggtttttg ccttaaatga cagaatacat tcccaatata ccaaacataa 
781 ctgtttccta ctagtcggcc gtacgggccc tttcgtctcg cgcgtttcgg tgatgacggt 
841 gaaaacctct gacacatgca gctcccggag acggtcacag cttgtctgta agcggatgcc
901 gggagcagac aagcccgtca gggcgcgtca gcgggtgttg gcgggtgtcg gggctggctt
961 aactatgcgg catcagagca gattgtactg agagtgcacc atatgcggtg tgaaataccg 
1021 cacagatgcg taaggagaaa ataccgcatc aggcggcctt aagggcctcg tgatacgcct 
1081 atttttatag gttaatgtca tgataataat ggtttcttag acgtcaggtg gcacttttcg 
1141 gggaaatgtg cgcggaaccc ctatttgttt atttttctaa atacattcaa atatgtatcc 
1201 gctcatgaga caataaccct gataaatgct tcaataatat tgaaaaagga agagtatgag 
1261 tattcaacat ttccgtgtcg cccttattcc cttttttgcg gcattttgcc ttcctgtttt 
1321 tgctcaccca gaaacgctgg tgaaagtaaa agatgctgaa gatcagttgg gtgcacgagt 
1381 gggttacatc gaactggatc tcaacagcgg taagatcctt gagagttttc gccccgaaga 
1441 acgttttcca atgatgagca cttttaaagt tctgctatgt ggcgcggtat tatcccgtat
1501 tgacgccggg caagagcaac tcggtcgccg catacactat tctcagaatg acttggttga 
1561 gtactcacca gtcacagaaa agcatcttac ggatggcatg acagtaagag aattatgcag 
1621 tgctgccata accatgagtg ataacactgc ggccaactta cttctgacaa cgatcggagg 
1681 accgaaggag ctaaccgctt ttttgcacaa catgggggat catgtaactc gccttgatcg 
1741 ttgggaaccg gagctgaatg aagccatacc aaacgacgag cgtgacacca cgatgcctgt 
1801 agcaatggca acaacgttgc gcaaactatt aactggcgaa ctacttactc tagcttcccg 
1861 gcaacaatta atagactgga tggaggcgga taaagttgca ggaccacttc tgcgctcggc 
1921 ccttccggct ggctggttta ttgctgataa atctggagcc ggtgagcgtg ggtctcgcgg 
1981 tatcattgca gcactggggc cagatggtaa gccctcccgt atcgtagtta tctacacgac
2041 ggggagtcag gcaactatgg atgaacgaaa tagacagatc gctgagatag gtgcctcact 
2101 gattaagcat tggtaactgt cagaccaagt ttactcatat atactttaga ttgatttaaa 
2161 acttcatttt taatttaaaa ggatctaggt gaagatcctt tttgataatc tcatgaccaa 
2221 aatcccttaa cgtgagtttt cgttccactg agcgtcagac cccgtagaaa agatcaaagg 
2281 atcttcttga gatccttttt ttctgcgcgt aatctgctgc ttgcaaacaa aaaaaccacc
2341 gctaccagcg gtggtttgtt tgccggatca agagctacca actctttttc cgaaggtaac 
2401 tggcttcagc agagcgcaga taccaaatac tgtccttcta gtgtagccgt agttaggcca 
2461 ccacttcaag aactctgtag caccgcctac atacctcgct ctgctaatcc tgttaccagt 
2521 ggctgctgcc agtggcgata agtcgtgtct taccgggttg gactcaagac gatagttacc
2581 ggataaggcg cagcggtcgg gctgaacggg gggttcgtgc acacagccca gcttggagcg 
2641 aacgacctac accgaactga gatacctaca gcgtgagcat tgagaaagcg ccacgcttcc
2701 cgaagggaga aaggcggaca ggtatccggt aagcggcagg gtcggaacag gagagcgcac 

2761 gagggagctt ccagggggaa acgcctggta tctttatagt cctgtcgggt ttcgccacct 

2821 ctgacttgag cgtcgatttt tgtgatgctc gtcagggggg cggagcctat ggaaaaacgc 

2881 cagcaacgcg gcctttttac ggttcctggc cttttgctgg ccttttgctc acatgttctt 
2941 tcctgcgtta tcccctgatt ctgtggataa ccgtattacc gcctttgagt gagctgatac 
3001 cgctcgccgc agccgaacga ccgagcgcag cgagtcagtg agcgaggaag cggaagagcg 

3061 cccaatacgc aaaccgcctc tccccgcgcg ttggccgatt cattaatgca gctggcacga 

3121 caggtttccc gactggaaag cgggcagtga gcgcaacgca attaatgtga gttagctcac 

3181 tcattaggca ccccaggctt tacactttat gcttccggct cgtatgttgt gtggaattgt

3241 gagcggataa caatttcaca caggaaacag ctatgaccat gattacgcca agcttgcatg 

3301 cctgcaggtc gactctagag gatccccagc ttgcatgcct gcaggtcgag gcatttgaat 

3361 tgggggtggt ggacagtaac tgtctgtaat aataattact cctgaccagg ttgcaattcg 

3421 agttttgata agcataatta taccttgtac attgtgggtt ttgtgctgtg gacgttttat 

3481 tgtggacatc cccataagct acaagaaacc aaaaatgaaa ttaaaagtat tgaaaaacgt 

3541 cgtaacattt tatatctgag tagtatcctt tgctttaaat gtccataaaa ataattttat 

3601 aatcaataaa acaacgtttg taaatcaact gagtttacaa gtagagacat tgagggatac 

3661 tttcactatg ctaaagtgaa taatcgacca aataataact cactttggta tttattcctg 

3721 tcttataatg ttatgtatga attaaattca tatgcatatg gctcactctg acaaaaaaaa 

3781 ataatcttcc agatcaatat tgactaccga tgcgggtggt cttttgcttt gaattctgct 

3841 gaactttaca ccccgaacag caatgtgtgc ttcagctaaa aaaaagtaag tgtgttaatc
3901 agtccccccg attcttcatt ttttgcccct ctctcccgtt tcgtcggcaa aagaagagaa
3961 aataaagata agtctcaaga taggttggta atcgctaaag tggttgtgtg gataagagta

4021 gcaaaatggc aggaagagca ctttgcgcgc acacactgta ctcattgttc tggataaaat
4081 tctctcgttg tttgccgtcg gatgtctgcc tctctgccat tgagccggct tcttcactat 
4141 ctttagttaa cctaaaatgc cgtttctttt ctcgtatccc actatccgtt gaggttctct 
4201 gctctcttcg ctcccttacc gccagcgagc aactatccgt gggggcgcct tgctcggaag 
4261 atggggggga agaaagaaga tttttgctat ttgcacttga gaaagagact tttcctgcgt 
4321 cgatggttag agaacagtgt gcagacactt ttcagctacc tagatacatg gatatccccg 
4381 cctcccaatc cacccaccca gggaaaaaga agggctcgcc gaaaaatcaa agttatctcc
4441 aggctcgcgc atcccaccga gcggttgact tctctccacc acttttcatt ttaaccctcg 
4501 gggtacggga ttggccaaag gacccaaagg tatgtttcga atgatactaa cataacatag 
4561 aacattttca ggaggaccct tggctagcgt cgacggtacc atggggcgcg ccaccaccat 
4621 gagtgcaatt aagccagtta tgaagattga attggtcatg gaaggagagg tgaacgggca
4681 caagttcacg atcacgggag agggacaagg caagccttac gagggaacac agactctaaa
4741 ccttacagtc actaaaggcg tgccccttcc tttcgctttc gatatcttgt caacagcatt 
4801 ccagtatggc aacagggtat ttaccaaata cccagatgat ataccggact atttcaagca 
4861 gacctttccg gaaggatatt cgtgggaaag aactttcaaa tatgaagagg gcgtttgcac 
4921 cacaaagagt gacataagcc tcaagaaagg ccaaccagac tgctttcaat ataaaattaa 
4981 ctttaaaggg gagaagcttg accccaacgg cccaattatg cagaagaaga ccctgaaatg 
5041 ggagccatcc actgagagga tgtacatgga tgtggataaa gacggtgcaa aggtgctgaa 
5101 gggcgatgtt aatgcggccc tgttgcttga aggaggcggc cattatcgtt gtgactttaa 
5161 cagtacttac aaggcgaaga aaactgtgtc cttcccagca tatcactttg tggaccaccg 
5221 cattgagatt ttgagccaca atacggatta cagcaaggtt acactgtatg aagttgccgt 
5281 ggctcgcaat tctcctcttc agattatggc gccccagtaa aggcttaacg aaaagccaag 
5341 actc 

Nucleotide sequence of pGR7 

1 aattctatta ctttgagtct accatcatga gtgcaattaa accagtcatg aagattgaat 

61 tggtcatgga aggagaggtg aacgggcaca agttcacgat cacgggagag ggacaaggca 

121 agccttacga gggaacacag actctaaacc ttacagtcac taaaggcgtg ccccttcctt 

181 tcgctttcga tatcttgtca acagcattcc agtatggcaa cagggtattt accaaatacc 

241 cagatgatat accggactat ttcaagcaga cctttccgga aggatattcg tgggaaagaa 

301 ctttcaaata tgaagagggc gtttgcacca caaagagtga cataagcctc aagaaaggcc 

361 aaccagactg ctttcaatat aaaattaact ttaaagggga gaagcttgac cccaacggcc 

421 caattatgcg gaagaagacc ctgaaatggg agccatccac tgagaggatg tacatggatg

481 tggataaaga cggtgcaaag gtgctgaagg gcgatgttaa tgcggccctg ttgcttgaag 

541 gaggcggcca ttatcgttgt gactttaaca gtacttacaa ggcgaagaaa actgtgtcct 

601 tcccagcata tcactttgtg gaccaccgca ttgagatttt gagccacaat acggattaca 

661 gcaaggttac actgtatgaa gttgccgtgg ctcgcaattc tcctcttcag attatggcgc

721 cccagtaaag gcttaacgaa aagccaagac gctcgagcac caccaccacc accactgaga 
781 tccggctgct aacaaagccc gaaaggaagc tgagttggct gctgccaccg ctgagcaata
841 actagcataa ccccttgggg cctctaaacg ggtcttgagg ggttttttgc tgaaaggagg
901 aactatatcc ggattggcga atgggacgcg ccctgtagcg gcgcattaag cgcggcgggt 
961 gtggtggtta cgcgcagcgt gaccgctaca cttgccagcg ccctagcgcc cgctcctttc 
1021 gctttcttcc cttcctttct cgccacgttc gccggctttc cccgtcaagc tctaaatcgg
1081 gggctccctt tagggttccg atttagtgct ttacggcacc tcgaccccaa aaaacttgat 
1141 tagggtgatg gttcacgtag tgggccatcg ccctgataga cggtttttcg ccctttgacg 
1201 ttggagtcca cgttctttaa tagtggactc ttgttccaaa ctggaacaac actcaaccct 
1261 atctcggtct attcttttga tttataaggg attttgccga tttcggccta ttggttaaaa 
1321 aatgagctga tttaacaaaa atttaacgcg aattttaaca aaatattaac gtttacaatt
1381 tcaggtggca cttttcgggg aaatgtgcgc ggaaccccta tttgtttatt tttctaaata 
1441 cattcaaata tgtatccgct catgagacaa taaccctgat aaatgcttca ataatattga 
1501 aaaaggaaga gtatgagtat tcaacatttc cgtgtcgccc ttattccctt ttttgcggca 
1561 ttttgccttc ctgtttttgc tcacccagaa acgctggtga aagtaaaaga tgctgaagat 
1621 cagttgggtg cacgagtggg ttacatcgaa ctggatctca acagcggtaa gatccttgag
1681 agttttcgcc ccgaagaacg ttttccaatg atgagcactt ttaaagttct gctatgtggc
1741 gcggtattat cccgtattga cgccgggcaa gagcaactcg gtcgccgcat acactattct 
1801 cagaatgact tggttgagta ctcaccagtc acagaaaagc atcttacgga tggcatgaca 
1861 gtaagagaat tatgcagtgc tgccataacc atgagtgata acactgcggc caacttactt 
1921 ctgacaacga tcggaggacc gaaggagcta accgcttttt tgcacaacat gggggatcat 
1981 gtaactcgcc ttgatcgttg ggaaccggag ctgaatgaag ccataccaaa cgacgagcgt 
2041 gacaccacga tgcctgcagc aatggcaaca acgttgcgca aactattaac tggcgaacta 
2101 cttactctag cttcccggca acaattaata gactggatgg aggcggataa agttgcagga
2161 ccacttctgc gctcggccct tccggctggc tggtttattg ctgataaatc tggagccggt 
2221 gagcgtgggt ctcgcggtat cattgcagca ctggggccag atggtaagcc ctcccgtatc 
2281 gtagttatct acacgacggg gagtcaggca actatggatg aacgaaatag acagatcgct 
2341 gagataggtg cctcactgat taagcattgg taactgtcag accaagttta ctcatatata 
2401 ctttagattg atttaaaact tcatttttaa tttaaaagga tctaggtgaa gatccttttt 
2461 gataatctca tgaccaaaat cccttaacgt gagttttcgt tccactgagc gtcagacccc
2521 gtagaaaaga tcaaaggatc ttcttgagat cctttttttc tgcgcgtaat ctgctgcttg 
2581 caaacaaaaa aaccaccgct accagcggtg gtttgtttgc cggatcaaga gctaccaact 
2641 ctttttccga aggtaactgg cttcagcaga gcgcagatac caaatactgt ccttctagtg 
2701 tagccgtagt taggccacca cttcaagaac tctgtagcac cgcctacata cctcgctctg 
2761 ctaatcctgt taccagtggc tgctgccagt ggcgataagt cgtgtcttac cgggttggac
2821 tcaagacgat agttaccgga taaggcgcag cggtcgggct gaacgggggg ttcgtgcaca 
2881 cagcccagct tggagcgaac gacctacacc gaactgagat acctacagcg tgagctatga 
2941 gaaagcgcca cgcttcccga agggagaaag gcggacaggt atccggtaag cggcagggtc 
3001 ggaacaggag agcgcacgag ggagcttcca gggggaaacg cctggtatct ttatagtcct 
3061 gtcgggtttc gccacctctg acttgagcgt cgatttttgt gatgctcgtc aggggggcgg 
3121 agcctatgga aaaacgccag caacgcggcc tttttacggt tcctggcctt ttgctggcct 
3181 tttgctcaca tgttctttcc tgcgttatcc cctgattctg tggataaccg tattaccgcc 
3241 tttgagtgag ctgataccgc tcgccgcagc cgaacgaccg agcgcagcga gtcagtgagc 
3301 gaggaagcgg aagagcgcct gatgcggtat tttctcctta cgcatctgtg cggtatttca

3361 caccgcatat atggtgcact ctcagtacaa tctgctctga tgccgcatag ttaagccagt
3421 atacactccg ctatcgctac gtgactgggt catggctgcg ccccgacacc cgccaacacc

3481 cgctgacgcg ccctgacggg cttgtctgct cccggcatcc gcttacagac aagctgtgac
3541 cgtctccggg agctgcatgt gtcagaggtt ttcaccgtca tcaccgaaac gcgcgaggca 
3601 gctgcggtaa agctcatcag cgtggtcgtg aagcgattca cagatgtctg cctgttcatc 
3661 cgcgtccagc tcgttgagtt tctccagaag cgttaatgtc tggcttctga taaagcgggc 
3721 catgttaagg gcggtttttt cctgtttggt cactgatgcc tccgtgtaag ggggatttct 
3781 gttcatgggg gtaatgatac cgatgaaacg agagaggatg ctcacgatac gggttactga 
3841 tgatgaacat gcccggttac tggaacgttg tgagggtaaa caactggcgg tatggatgcg 
3901 gcgggaccag agaaaaatca ctcagggtca atgccagcgc ttcgttaata cagatgtagg 
3961 tgttccacag ggtagccagc agcatcctgc gatgcagatc cggaacataa tggtgcaggg 
4021 cgctgacttc cgcgtttcca gactttacga aacacggaaa ccgaagacca ttcatgttgt 
4081 tgctcaggtc gcagacgttt tgcagcagca gtcgcttcac gttcgctcgc gtatcggtga
4141 ttcattctgc taaccagtaa ggcaaccccg ccagcctagc cgggtcctca acgacaggag 
4201 cacgatcatg cgcacccgtg gggccgccat gccggcgata atggcctgct tctcgccgaa 

4261 acgtttggtg gcgggaccag tgacgaaggc ttgagcgagg gcgtgcaaga ttccgaatac
4321 cgcaagcgac aggccgatca tcgtcgcgct ccagcgaaag cggtcctcgc cgaaaatgac 

4381 ccagagcgct gccggcacct gtcctacgag ttgcatgata aagaagacag tcataagtgc
4441 ggcgacgata gtcatgcccc gcgcccaccg gaaggagctg actgggttga aggctctcaa 
4501 gggcatcggt cgagatcccg gtgcctaatg agtgagctaa cttacattaa ttgcgttgcg

4561 ctcactgccc gctttccagt cgggaaacct gtcgtgccag ctgcattaat gaatcggcca 

4621 acgcgcgggg agaggcggtt tgcgtattgg gcgccagggt ggtttttctt ttcaccagtg 

4681 agacgggcaa cagctgattg cccttcaccg cctggccctg agagagttgc agcaagcggt

4741 ccacgctggt ttgccccagc aggcgaaaat cctgtttgat ggtggttaac ggcgggatat

4801 aacatgagct gtcttcggta tcgtcgtatc ccactaccga gatatccgca ccaacgcgca

4861 gcccggactc ggtaatggcg cgcattgcgc ccagcgccat ctgatcgttg gcaaccagca

4921 tcgcagtggg aacgatgccc tcattcagca tttgcatggt ttgttgaaaa ccggacatgg

4981 cactccagtc gccttcccgt tccgctatcg gctgaatttg attgcgagtg agatatttat

5041 gccagccagc cagacgcaga cgcgccgaga cagaacttaa tgggcccgct aacagcgcga

5101 tttgctggtg acccaatgcg accagatgct ccacgcccag tcgcgtaccg tcttcatggg

5161 agaaaataat actgttgatg ggtgtctggt cagagacatc aagaaataac gccggaacat

5221 tagtgcaggc agcttccaca gcaatggcat cctggtcatc cagcggatag ttaatgatca

5281 gcccactgac gcgttgcgcg agaagattgt gcaccgccgc tttacaggct tcgacgccgc

5341 ttcgttctac catcgacacc accacgctgg cacccagttg atcggcgcga gatttaatcg

5401 ccgcgacaat ttgcgacggc gcgtgcaggg ccagactgga ggtggcaacg ccaatcagca

5461 acgactgttt gcccgccagt tgttgtgcca cgcggttggg aatgtaattc agctccgcca

5521 tcgccgcttc cactttttcc cgcgttttcg cagaaacgtg gctggcctgg ttcaccacgc

5581 gggaaacggt ctgataagag acaccggcat actctgcgac atcgtataac gttactggtt

5641 tcacattcac caccctgaat tgactctctt ccgggcgcta tcatgccata ccgcgaaagg 

5701 ttttgcgcca ttcgatggtg tccgggatct cgacgctctc ccttatgcga ctcctgcatt 

5761 aggaagcagc ccagtagtag gttgaggccg ttgagcaccg ccgccgcaag gaatggtgca

5821 tgcaaggaga tggcgcccaa cagtcccccg gccacggggc ctgccaccat acccacgccg

5881 aaacaagcgc tcatgagccc gaagtggcga gcccgatctt ccccatcggt gatgtcggcg

5941 atataggcgc cagcaaccgc acctgtggcg ccggtgatgc cggccacgat gcgtccggcg 

6001 tagaggatcg agatcgatct cgatcccgcg aaattaatac gactcactat aggggaattg 

6061 tgagcggata acaattcccc tctagaaata attttgttta actttaagaa ggagatatac 

6121 atatgagcga taaaattatt cacctgactg acgacagttt tgacacggat gtactcaaag 

6181 cggacggggc gatcctcgtc gatttctggg cagagtggtg cggtccgtgc aaaatgatcg 

6241 ccccgattct ggatgaaatc gctgacgaat atcagggcaa actgaccgtt gcaaaactga

6301 acatcgatca aaaccctggc actgcgccga aatatggcat ccgtggtatc ccgactctgc

6361 tgctgttcaa aaacggtgaa gtggcggcaa ccaaagtggg tgcactgtct aaaggtcagt

6421 tgaaagagtt cctcgacgct aacctggccg gttctggttc tggccatatg caccatcatc

6481 atcatcattc ttctggtctg gtgccacgcg gttctggtat gaaagaaacc gctgctgcta

6541 aattcgaacg ccagcacatg gacagcccag atctgggtac cgacgacgac gacaaggcca 

6601 tggctgatat cggatccg 

Nucleotide sequence of pDW2700

1 gatccccagc ttgcatgcct gcaggtcgag gcatttgaat tgggggtggt ggacagtaac
61 tgtctgtaat aataattact cctgaccagg ttgcaattcg agttttgata agcataatta
121 taccttgtac attgtgggtt ttgtgctgtg gacgttttat tgtggacatc cccataagct
181 acaagaaacc aaaaatgaaa ttaaaagtat tgaaaaacgt cgtaacattt tatatctgag
241 tagtatcctt tgctttaaat gtccataaaa ataattttat aatcaataaa acaacgtttg
301 taaatcaact gagtttacaa gtagagacat tgagggatac tttcactatg ctaaagtgaa
361 taatcgacca aataataact cactttggta tttattcctg tcttataatg ttatgtatga 
421 attaaattca tatgcatatg gctcactctg acaaaaaaaa ataatcttcc agatcaatat 
481 tgactaccga tgcgggtggt cttttgcttt gaattctgct gaactttaca ccccgaacag 
541 caatgtgtgc ttcagctaaa aaaaagtaag tgtgttaatc agtccccccg attcttcatt 
601 ttttgcccct ctctcccgtt tcgtcggcaa aagaagagaa aataaagata agtctcaaga 
661 taggttggta atcgctaaag tggttgtgtg gataagagta gcaaaatggc aggaagagca 
721 ctttgcgcgc acacactgta ctcattgttc tggataaaat tctctcgttg tttgccgtcg 
781 gatgtctgcc tctctgccat tgagccggct tcttcactat ctttagttaa cctaaaatgc
841 cgtttctttt ctcgtatccc actatccgtt gaggttctct gctctcttcg ctcccttacc
901 gccagcgagc aactatccgt gggggcgcct tgctcggaag atggggggga agaaagaaga
961 tttttgctat ttgcacttga gaaagagact tttcctgcgt cgatggttag agaacagtgt
1021 gcagacactt ttcagctacc tagatacatg gatatccccg cctcccaatc cacccaccca
1081 gggaaaaaga agggctcgcc gaaaaatcaa agttatctcc aggctcgcgc atcccaccga
1141 gcggttgact tctctccacc acttttcatt ttaaccctcg gggtacggga ttggccaaag
1201 gacccaaagg tatgtttcga atgatactaa cataacatag aacattttca ggaggaccct 
1261 tggctagcgt cgacggtacc atggggcgcg ccgaattcgt taactgatca ctcgagatgc 
1321 atggccggcc gagctccgca tcggccgctg tcatcagatc gccatctcgc gcccgtgcct 
1381 ctgacttcta agtccaatta ctcttcaaca tccctacatg ctctttctcc ctgtgctccc
1441 accccctatt tttgttatta tcaaaaaaac ttcttcttaa tttctttgtt ttttagcttc 
1501 ttttaagtca cctctaacaa tgaaattgtg tagattcaaa aatagaatta attcgtaata 
1561 aaaagtcgaa aaaaattgtg ctccctcccc ccattaataa taattctatc ccaaaatcta 
1621 cacaatgttc tgtgtacact tcttatgttt tttttacttc tgataaattt tttttgaaac
1681 atcatagaaa aaaccgcaca caaaatacct tatcatatgt tacgtttcag tttatgaccg 
1741 caatttttat ttcttcgcac gtctgggcct ctcatgacgt caaatcatgc tcatcgtgaa
1801 aaagttttgg agtatttttg gaatttttca atcaagtgaa agtttatgaa attaattttc 
1861 ctgcttttgc tttttggggg tttcccctat tgtttgtcaa gagtttcgag gacggcgttt 
1921 ttcttgctaa aatcacaagt attgatgagc acgatgcaag aaagatcgga agaaggtttg 
1981 ggtttgaggc tcagtggaag gtgagtagaa gttgataatt tgaaagtgga gtagtgtcta 
2041 tggggttttt gccttaaatg acagaataca ttcccaatat accaaacata actgtttcct 
2101 actagtcggc cgtacgggcc ctttcgtctc gcgcgtttcg gtgatgacgg tgaaaacctc 
2161 tgacacatgc agctcccgga gacggtcaca gcttgtctgt aagcggatgc cgggagcaga 
2221 caagcccgtc agggcgcgtc agcgggtgtt ggcgggtgtc ggggctggct taactatgcg 
2281 gcatcagagc agattgtact gagagtgcac catatgcggt gtgaaatacc gcacagatgc
2341 gtaaggagaa aataccgcat caggcggcct taagggcctc gtgatacgcc tatttttata 
2401 ggttaatgtc atgataataa tggtttctta gacgtcaggt ggcacttttc ggggaaatgt 
2461 gcgcggaacc cctatttgtt tatttttcta aatacattca aatatgtatc cgctcatgag 
2521 acaataaccc tgataaatgc ttcaataata ttgaaaaagg aagagtatga gtattcaaca 
2581 tttccgtgtc gcccttattc ccttttttgc ggcattttgc cttcctgttt ttgctcaccc
2641 agaaacgctg gtgaaagtaa aagatgctga agatcagttg ggtgcacgag tgggttacat
2701 cgaactggat ctcaacagcg gtaagatcct tgagagtttt cgccccgaag aacgttttcc 
2761 aatgatgagc acttttaaag ttctgctatg tggcgcggta ttatcccgta ttgacgccgg 
2821 gcaagagcaa ctcggtcgcc gcatacacta ttctcagaat gacttggttg agtactcacc 
2881 agtcacagaa aagcatctta cggatggcat gacagtaaga gaattatgca gtgctgccat 
2941 aaccatgagt gataacactg cggccaactt acttctgaca acgatcggag gaccgaagga 
3001 gctaaccgct tttttgcaca acatggggga tcatgtaact cgccttgatc gttgggaacc 
3061 ggagctgaat gaagccatac caaacgacga gcgtgacacc acgatgcctg tagcaatggc
3121 aacaacgttg cgcaaactat taactggcga actacttact ctagcttccc ggcaacaatt 
3181 aatagactgg atggaggcgg ataaagttgc aggaccactt ctgcgctcgg cccttccggc 
3241 tggctggttt attgctgata aatctggagc cggtgagcgt gggtctcgcg gtatcattgc 
3301 agcactgggg ccagatggta agccctcccg tatcgtagtt atctacacga cggggagtca 
3361 ggcaactatg gatgaacgaa atagacagat cgctgagata ggtgcctcac tgattaagca
3421 ttggtaactg tcagaccaag tttactcata tatactttag attgatttaa aacttcattt 
3481 ttaatttaaa aggatctagg tgaagatcct ttttgataat ctcatgacca aaatccctta 
3541 acgtgagttt tcgttccact gagcgtcaga ccccgtagaa aagatcaaag gatcttcttg 
3601 agatcctttt tttctgcgcg taatctgctg cttgcaaaca aaaaaaccac cgctaccagc 
3661 ggtggtttgt ttgccggatc aagagctacc aactcttttt ccgaaggtaa ctggcttcag 
3721 cagagcgcag ataccaaata ctgtccttct agtgtagccg tagttaggcc accacttcaa 
3781 gaactctgta gcaccgccta catacctcgc tctgctaatc ctgttaccag tggctgctgc 
3841 cagtggcgat aagtcgtgtc ttaccgggtt ggactcaaga cgatagttac cggataaggc 
3901 gcagcggtcg ggctgaacgg ggggttcgtg cacacagccc agcttggagc gaacgaccta
3961 caccgaactg agatacctac agcgtgagca ttgagaaagc gccacgcttc ccgaagggag 
4021 aaaggcggac aggtatccgg taagcggcag ggtcggaaca ggagagcgca cgagggagct 
4081 tccaggggga aacgcctggt atctttatag tcctgtcggg tttcgccacc tctgacttga 
4141 gcgtcgattt ttgtgatgct cgtcaggggg gcggagccta tggaaaaacg ccagcaacgc 
4201 ggccttttta cggttcctgg ccttttgctg gccttttgct cacatgttct ttcctgcgtt 
4261 atcccctgat tctgtggata accgtattac cgcctttgag tgagctgata ccgctcgccg 
4321 cagccgaacg accgagcgca gcgagtcagt gagcgaggaa gcggaagagc gcccaatacg 
4381 caaaccgcct ctccccgcgc gttggccgat tcattaatgc agctggcacg acaggtttcc 
4441 cgactggaaa gcgggcagtg agcgcaacgc aattaatgtg agttagctca ctcattaggc 
4501 accccaggct ttacacttta tgcttccggc tcgtatgttg tgtggaattg tgagcggata 
4561 acaatttcac acaggaaaca gctatgacca tgattacgcc aagcttgcat gcctgcaggt 
4621 cgactctaga g 
Emission spectrum of Thioredoxin-FP-fusion
protein from pGR3 at 452nm excitation
Emission spectrum of Thioredoxin-FP-fusion
protein from pGR3 at 489nm excitation
Emission spectrum of Thioredoxin-FP-fusion
protein from pGR3 at 469nm excitation
Fig 17



pGR3 excitation spectrum at 490nm emission 
Excitation spectrum of Thioredoxin-FP-fusion
protein from pGR7 at 490nm emission
Emission spectrum of Thioredoxin-FP-fusion
protein from pGR7 at 452nm excitation
Emission and Excitation spectra of Thioredoxin-FP-
Fusion protein from pGR7
Fig 21



Emission spectrum of Thioredoxin-FP-Fusion
protein from pGR13 at 452nm excitation
Emission spectrum of Thioredoxin-FP-Fusion
protein from pGR13 at 469nm excitation
Excitation spectrum of Thioredoxin-FP-Fusion
proteins from pGR13 at 490nm emission
Emission spectrum of Thioredoxin-FP-Fusion protein pGRIS at 451nm